UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office



January 04, 2005









APPLICATION NUMBER: 60/491,822 

FILING DATE: July 31, 2003

RELATED PCT APPLICATION NUMBER: PCT/US04/24868








Certified By 




Jon W. Dudas



Under Secretary 

of Commerce for Intellectual Property 

and Acting Director of the

United States Patent and Trademark Office







"Express Mail" Mailing Label No. EV 351 355 394 US 

I hereby certify that this paper or fee is being deposited with the United States Postal Service "Express Mail Post Office to Addressee" service

Date of Deposit: July 31, 2003

above and is addressed to the Commissioner of Patents & Trademarks, Washington, D.C. 20231.




Date: July 31, 2003 
PROVISIONAL APPLICATION COVER SHEET 



Customer No. 27476 


Docket No. 20663.001


Type a plus sign (+) inside this
box -> +


INVENTOR(S)/APPLICANT(S)


LAST NAME 


FIRST NAME 


MIDDLE INITIAL 


RESIDENCE (CITY AND EITHER 
STATE OR FOREIGN COUNTRY) 


Grandi


Guido 






Telford 


John 






Bensi 


Giuliano 






TITLE OF INVENTION (280 characters max) 


Immunogenic Compositions for Streptococcus pyogenes 


CORRESPONDENCE ADDRESS 


Rebecca M. Hale 
CHIRON CORPORATION 
Intellectual Property - R440 
P.O. Box 8097 
Emeryville 


STATE: California 


ZIP CODE: 94662-8097 


COUNTRY: USA


ENCLOSED APPLICATION PARTS (check all that apply) 


_X Specification 


Number of Pages: $8 


Drawing(s) 


Number of Pages: 


Small Entity Statement 


Other (specify) 


METHOD OF PAYMENT (check one) 


X A check or money order is enclosed to cover the Provisional filing fees 


_X The Commissioner is hereby authorized to charge any 
additional fees and credit Deposit Account Number 03-1664. 


PROVISIONAL FILING FEE AMOUNT ENCLOSED $160.00 
CHECK NO. 8182 



The invention was made by an agency of the United States Government or under a contract with an agency of the United States 
government. 
X No 

Yes, the name of the U.S. Government agency and the Government contract number are: 



July 31, 2003

CHIRON CORPORATION 

Intellectual Property - R440 

P.O. Box 8097 

Emeryville, CA 94662-8097 

(510) 923-3179 - (510) 655-3542 (fax)



Respectfully submitted, 



By: 



Rebecca M. Hale 
Attorney for Applicants 
Reg. No. 45,680 



IMMUNOGENIC COMPOSITIONS FOR STREPTOCOCCUS PYOGENES
All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the fields of immunology and vaccinology. In particular, it relates to antigens
derived from Streptococcus pyogenes and their use in immunisation.

BACKGROUND ART 

Group A streptococcus ("GAS", S.pyogenes) is a frequent human pathogen, estimated to be
present in 5-15% of normal individuals without signs of disease. When host defences
are compromised, or when the organism is able to exert its virulence, or when it is introduced to
vulnerable tissues or hosts, however, an acute infection occurs. Related diseases include
puerperal fever, scarlet fever, erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis and
streptococcal toxic shock syndrome.

Although S.pyogenes may be treated using antibiotics, a prophylactic vaccine to prevent the onset
of disease is desired. Efforts to develop such a vaccine have been ongoing for many decades.
While various GAS vaccine approaches have been suggested and some approaches are currently in
clinical trials, to date, there are no GAS vaccines available to the public.

It is an object of the invention to provide further and improved compositions for providing immunity 
against GAS disease and/or infection. The compositions are based on a combination of two or more 
(eg. three or more) GAS antigens. 

DISCLOSURE OF THE INVENTION

Applicants have discovered a group of thirty-one GAS antigens that are particularly suitable for
immunisation purposes, particularly when used in combinations. The invention therefore provides an
immunogenic composition comprising a combination of GAS antigens, said combination consisting
of two to thirty-one GAS antigens of a first antigen group, said first antigen group consisting of: GAS
117, GAS 130, GAS 277, GAS 236, GAS 40, GAS 389, GAS 504, GAS 509, GAS 366, GAS 159,
GAS 217, GAS 309, GAS 372, GAS 039, GAS 042, GAS 058, GAS 290, GAS 511, GAS 533, GAS
527, GAS 294, GAS 253, GAS 529, GAS 045, GAS 095, GAS 193, GAS 137, GAS 084, GAS 384,
GAS 202, and GAS 057. These antigens are referred to herein as the 'first antigen group'.
Preferably, the combination of GAS antigens consists of three, four, five, six, seven, eight, nine, or ten
GAS antigens selected from the first antigen group. More preferably, the combination of GAS antigens
consists of three, four, or five GAS antigens selected from the first antigen group.

GAS 40 and GAS 117 are particularly preferred GAS antigens. Preferably, the combination of GAS
antigens includes either or both of GAS 40 and GAS 117. Representative examples of some of these
antigen combinations are discussed below.

The combination of GAS antigens may consist of three GAS antigens selected from the first antigen
group. Accordingly, in one embodiment, the combination of GAS antigens consists of GAS 40, GAS
117 and a third GAS antigen selected from the first antigen group. In another embodiment, the
combination of GAS antigens consists of GAS 40 and two additional GAS antigens selected from the
first antigen group. In another embodiment, the combination of GAS antigens consists of GAS 117
and two additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of four GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and two
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and three additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and three
additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of five GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and three
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and four additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and four
additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of eight GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and six
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and seven additional GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117 and seven
additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of ten GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and eight
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and nine additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and nine
additional GAS antigens selected from the first antigen group.
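
The combinations described above can be enumerated mechanically. The following Python sketch is offered only as an illustration and is not part of the disclosed method: the list reproduces the first antigen group as enumerated in the text, while the helper name `combos_with_required` is hypothetical.

```python
from itertools import combinations
from math import comb

# First antigen group, as enumerated in the text above (31 entries).
FIRST_ANTIGEN_GROUP = [
    "GAS 117", "GAS 130", "GAS 277", "GAS 236", "GAS 40", "GAS 389",
    "GAS 504", "GAS 509", "GAS 366", "GAS 159", "GAS 217", "GAS 309",
    "GAS 372", "GAS 039", "GAS 042", "GAS 058", "GAS 290", "GAS 511",
    "GAS 533", "GAS 527", "GAS 294", "GAS 253", "GAS 529", "GAS 045",
    "GAS 095", "GAS 193", "GAS 137", "GAS 084", "GAS 384", "GAS 202",
    "GAS 057",
]


def combos_with_required(group, size, required=()):
    """Yield each `size`-member combination of `group` that contains all `required` antigens."""
    required = tuple(required)
    missing = [a for a in required if a not in group]
    if missing:
        raise ValueError(f"not in group: {missing}")
    if size < len(required):
        return  # no combination of this size can hold every required antigen
    remaining = [a for a in group if a not in required]
    for extra in combinations(remaining, size - len(required)):
        yield required + extra


# Three-antigen combinations that contain both GAS 40 and GAS 117.
triples_with_both = list(
    combos_with_required(FIRST_ANTIGEN_GROUP, 3, ("GAS 40", "GAS 117"))
)

# Number of possible combinations of each size discussed above,
# ignoring any requirement to include GAS 40 or GAS 117.
counts_by_size = {k: comb(len(FIRST_ANTIGEN_GROUP), k) for k in (3, 4, 5, 8, 10)}
```

Each element of `triples_with_both` pairs GAS 40 and GAS 117 with one further member of the group, which mirrors the first three-antigen embodiment described above.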

Each of the GAS antigens of the first antigen group is described in more detail below. Genomic
sequences of at least three GAS strains are publicly available. The genomic sequence of an M1 GAS
strain is reported at Ref. 1. The genomic sequence of an M3 GAS strain is reported at Ref. 2. The
genomic sequence of an M18 GAS strain is reported at Ref. 3. Preferably, the GAS antigens of the
invention comprise a polynucleotide or amino acid sequence of an M1, M3 or M18 GAS strain. More
preferably, the GAS antigens of the invention comprise a polynucleotide or amino acid sequence of an
M1 strain.

(1) GAS 117

GAS 117 corresponds to M1 GenBank accession numbers GI:13621679 and GI:15674571, to M3
GenBank accession number GI:21909852, to M18 GenBank accession number GI:19745578, and is
also referred to as 'Spy0448' (M1), 'SpyM3_0316' (M3), and 'SpyM18_0491' (M18). Examples of
amino acid and polynucleotide sequences of GAS 117 of an M1 strain are set forth below:

SEQ ID NO: 1 

MTLKKHYYLLSLLALVTVGAAFNTS 

1X5RHYSSYYYYNLRTVMGLSSEQDIEKHYEELKNKLHDMYNHY 

SEQ ID NO: 2

ATGACACTAAAAAAACACTATTATCTTCTCAGCCTGCTAGCTCTTC^ 

CAAGC(^GAGTGTCAGTGCACAAGTTTATAGCAATGAAGGGTATCAC^ 

ACACCTGGAATATAGTAAAGACAACGCAGAACTTCAATTC 

CTAGGGAGACACTACTCTAGCTATTATTACTACAACCTAAGAACCGTTATGGGACTATCAAGTGAGCAAG 
1 5 ACATTGAAAAACACTATGAAGAGCTTAAGAACAAGTTACATGATATGTACAATCATTAT^ 

Preferred GAS 117 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 1; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 1, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 117 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 1. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 1. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 1. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 1
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
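
The identity criterion in (a) can be illustrated with a short Python sketch. This passage does not say how percent identity is calculated, so the sketch makes assumptions that are not taken from the text: a plain Needleman-Wunsch global alignment with arbitrary scores (match +1, mismatch -1, gap -1), and identity defined as identical aligned positions divided by alignment length. The function names are hypothetical, the reference string is only the legible N-terminal stretch of SEQ ID NO: 1 as printed above, and the variant is invented for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned residues as a percentage of the alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    al_a, al_b = global_align(a, b)
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)


def meets_identity(candidate, reference, threshold=50.0):
    """True if the candidate reaches the chosen identity threshold."""
    return percent_identity(candidate, reference) >= threshold


# Legible N-terminal stretch of SEQ ID NO: 1 (25 residues) and a made-up
# single-substitution variant (K -> R at position 5), for illustration only.
REFERENCE = "MTLKKHYYLLSLLALVTVGAAFNTS"
VARIANT = "MTLKRHYYLLSLLALVTVGAAFNTS"
identity = percent_identity(VARIANT, REFERENCE)
```

Alignment tools differ in scoring schemes and in the denominator used for identity, so the values this sketch produces should not be read as the values any particular program would report.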

(2) GAS 130 

GAS 130 corresponds to M1 GenBank accession numbers GI:13621794 and GI:15674677, to M3
GenBank accession number GI:21909954, to M18 GenBank accession number GI:19745704, and is
also referred to as 'Spy0591' (M1), 'SpyM3_0418' (M3), and 'SpyM18_0660' (M18). GAS 130 has
been identified as a putative protease. Examples of amino acid and polynucleotide
sequences of GAS 130 of an M1 strain are set forth below:

SEQ ID NO: 3

MSHMKKRPEVLSPAGTLEKLKVAIDYGADAVFVGGQAYGLRSRAGNFSMEEI^EGIDYAHARGAKVYV^ 
NMVTHEGNEIGAGEWFRQLRDMGLDAVIVSDPALIVICSTEAPGLEIHLSTQASSTNYETPEFWKAMGLT 
RWLAREVNMAELAE IRKRTDVEI EAFVHGAMCI S YSGRCVLSNHMSHRDANRGGCSQSCRWKYDLYDMP 
FGGERRSLKGEI PEDYSMSSVDMCM I DH I PDLI ENGVDSLKI EGRMKS I HYVSTVTNCYKAAVGAYMES P 
40 EAFYAI KEELI DELWKVAQRELATGFYYGI PTENEQLFGARRKI PQYKFVGE WAFDSASMTATI RQRNV 
IMEGDR I ECYGPGFRHFETVVKDLHDADGQKI DRAPN PMELLTI S LPREVKPGDM I RACKEGLVNLYQKD 
GTSKTVRT 

SEQ ID NO: 4 

45 ATGTCACATATGAAAAAACGTCCCGAGGTCTTATCACCTGCT 
TTGACTATGGCGCAGATGCTGTTTTTGTTGGAGGGCAGGCCT 
CTCTATGGAAGAATTGCAAGAAGGCATTGATTATGCACATGCGCG^^
AACATGGTTACCCACGAAGGGAACGAAATTGGTGCGGGCGAGTGGTTTQ3 
TTGATGCGGTCATTGTTTCAGATCCAGCCTTGATTGTTATTTC 
TCATTTGTCAACGCAAGCTTCATCTACCAATTACGAGACC 
5 CGAGTTGTTTTAGCTCGCX^GGTTAATATC^ 

TTGAAGCCTTTGTCCATGGAGCCATGTGTATCTCTTATTCAGGCCG^ 

TC^CCGTGATGCCAACAGGGGCGGCTGCTCACAGTCTTGCCGCTOGAAGTATGAT^ 
TTTGGAGGAGAGCGCCGCTCCTTAAAAGGGGAAATTCCAGAAGACT 

GTATGATTGACCATATTCCTGACCTGATTGAAAATGGGGTTGATAGC 

1 0 ATCTATCCACTACX^TCTCAACCGTAACCAACTGTTACAAGGCGGCTGTAGGTGCTTACATGGAAAGCC^ 

GAAGCTTTTTATGCTATCAAAGAGGAATTGATTGACX^ 

CAGGTTTTTACTATGGTATCCCAACTGAAAATGAACAATTATTTGGTGCT 

TAAATTTGTCGGAGAAGTAGTTGCCTTTGACrrCAGCTAGCATGACAGCGACC^ 

ATCATGGAAGGCGATCGGATTGAATGTTATGGACCAGGTTTCCGTCATTTTG 
1 5 TACATGATGCGGATGGCCAAAAGATTGACCGTGCCCCAAATC 

GAGAGAAGTTAAGCCAGGGGATATGATTAGGGCTTGCAAGGAAGGTCTGGTTAACCTCT 
GGCACCAGTAAAACTGTTAGAACATAG 

Preferred GAS 130 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 3; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 3, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 130 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 3. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 3. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 3. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of
a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
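
The fragment criterion in (b) amounts to a sliding-window test over the reference sequence. The Python sketch below is illustrative only: the function names are hypothetical, the reference string is just the first 20 legible residues of SEQ ID NO: 3 as printed above, and the candidate strings are invented.

```python
def fragments(seq, n):
    """Return every run of n consecutive residues in seq, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def comprises_fragment(candidate, reference, n=7):
    """True if candidate contains at least one run of n consecutive reference residues."""
    return any(frag in candidate for frag in fragments(reference, n))


# First 20 legible residues of SEQ ID NO: 3, used here only as a short stand-in.
REFERENCE_START = "MSHMKKRPEVLSPAGTLEKL"

seven_mers = fragments(REFERENCE_START, 7)

# Invented candidates: one embeds a 7-residue run of the reference, one does not.
has_run = comprises_fragment("AAAMSHMKKRAAA", REFERENCE_START, n=7)
too_short = comprises_fragment("MSHMKK", REFERENCE_START, n=7)
```

Raising `n` to any of the larger values listed above (for example 20 or 50) tightens the test, since a longer exact run must then be shared with the reference.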

(3) GAS 277

GAS 277 corresponds to M1 GenBank accession numbers GI:13622962 and GI:15675742, to M3
GenBank accession number GI:21911206, to M18 GenBank accession number GI:19746852, and is
also referred to as 'Spy1939' (M1), 'SpyM3_1670' (M3), and 'SpyM18_2006' (M18). Amino acid
and polynucleotide sequences of GAS 277 of an M1 strain are set forth below:

SEQ ID NO: 5

M TTMQKTI SLLSLALLIGLLGTSGKAI SVYA QDQHTDNVI ABSTI SOVS VBASMRGTE PY I DATVTTDQP 

VRQPTQATITLKDASDNTINSWVYTMAAQQRRFTAW 

QNKARKTPTNMQQKDTSKAMTNSVDVDTKA 

^ ASNSQKNGSNKTKMLVDKEEVKPTSKRGFPWVLLGLWSLAAGLFIAIQKVSRRK 

SEQ ID NO: 6 

ATGACAACTATGCAAAAAACAATTAGCTTATTATCACTAG 
GCAAAGCCATATCTGTGTATGCACAAGATCAGCACACTGATAATC 
GGTCAGTGTTGAAGCCAGTATGCGTGGAACAGAACCTTATATTGATGCT 
45 GTCAGACAACCAACTCAGGCAACGATAACACTTAAAGACX3CTAGTC 
ATACTATGGCAGCGCAACAGCGTCGTTTTACAGCTTGGTT 

TCATGTAACTGTCACCGTTCATACTCAAGAAAAGGCAGTAA 

CAAAACAAAGCTAGAAAAACACCAACTAATATGCAACAAAAGGATACTTCTAAAGCAATGACGAA 
TCGATGTAGACACAAAAGCTCAAACAAATCAATCAGCTAACCAAGA 

50 CAGATCAGCTACTAATCATCGATCAACTTCCTTAAAGC 

GCTAGTAATAGCCAAAAAAACGGTAGCAACAAGACAAAAATGCTAGTGGACAAAGAGGAAGTAAAACCTA 
CTTCAAAAAGAGGATTCCCTTGGGTCTTATTAGGT
TATTCAAAAAGTATCTAGACGAAAATAA 

Preferred GAS 277 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 5; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 5, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 277 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 5. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 5. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 5. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 5
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
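
Fragments lacking residues from one or both termini can be produced by simple slicing. The Python sketch below is an illustration under stated assumptions: the function name `truncate_termini` is hypothetical, and the example string is only the first 13 residues of SEQ ID NO: 5 as they appear above with the extraction spacing removed, not the full sequence.

```python
def truncate_termini(seq, n_term=0, c_term=0):
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(seq):
        raise ValueError("truncation would remove the whole sequence")
    return seq[n_term:len(seq) - c_term]


# Short stand-in for SEQ ID NO: 5 (first 13 residues only).
EXAMPLE = "MTTMQKTISLLSL"

n_truncated = truncate_termini(EXAMPLE, n_term=3)             # lacks 3 N-terminal residues
c_truncated = truncate_termini(EXAMPLE, c_term=2)             # lacks 2 C-terminal residues
both_truncated = truncate_termini(EXAMPLE, n_term=1, c_term=1)

# A series of N-terminal truncations using some of the counts listed above.
series = {k: truncate_termini(EXAMPLE, n_term=k) for k in (1, 2, 3, 4, 5)}
```

Removing a defined leader, such as the underlined N-terminal stretch mentioned above, corresponds to calling the function with `n_term` set to the length of that stretch.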

(4) GAS 236 

GAS 236 corresponds to M1 GenBank accession numbers GI:13622264 and GI:15675106, to M3
GenBank accession number GI:21910321, to M18 GenBank accession number GI:19746075,
and is also referred to as 'Spy1126' (M1), 'SpyM3_0785' (M3), and 'SpyM18_1087' (M18). Amino
acid and polynucleotide sequences of GAS 236 from an M1 strain are set forth below:

SEQ ID NO: 7 

MT^WYTGKVKRVAIIANGKYQSKRVASKLFSVFKDDPDFYLSKKNPDIVISIGGDGMLLSAFHMYBKEL 
DKVRFVGIHTGHLGFYTDYRDFEVDKLIDNLRKD 

KTMVADVI INHVKFESFRGDGISVSTPTGSTAYNKSLGGAVLHPTI EALQLTEI SSLNNRVFRTLGSSI I 
25 IPKKDKIELVPKRLGIYTISIDNKTYQLKNVTK^ 

SEQ ID NO: 8 

ATGACACAGATGAATTATACAGGTAAGGTAAAACGAGTTGCTATTATTGCAAATGGTAAGTACCA 
AACGCGTCGCCTCCAAACTTTTCTCCGTATTTAAAGATGATC 
30 GGATATTGTGATTTCTATTGGCGGAGATGGGATGCTCTTATCTGCCTTTCACATGTATGAAAAAGAATTA 
GATAAGGTACGTTTTGTAGGAATCCACACCGGTCATCTTGGCTTTTATACCGATTATAGGGATTTTGAAG 
TTGATAAATTAATTGATAATTTAAGAAAAGACAAGGGAGAACA 

TATTACTTTAGATGATGGTCGTGTGGTTAAAGCGCGTGCTTTGAATGAAGCGACGGTTAAGCGTATTGAA 
AAAACGATGGTAGCAGATGTTATTATTAACCATGTCAAATTTG 

3 5 TATCGACCCCGACAGGGAGCACAGCCTACAATAAATCTTTAGGTGGTGCTGTCTTGCATCCGACGATTGA 
AGCGCTGCAATTGACGGAAATTTCCAGTCTTAATAACCGTGTCTTTAGAACCTTGGGCTCATCAATCATT 
ATTCC C AAAAAAGAT AAGATTGAGTTAGTGCCAAAACGATT AGGAATTTATAC C ATTTCCATTGATAATA 
AAACCTATCAGTTAAAAAATGTGACGAAGGTGGAGTATTTTATCGACGATGAGAAAATTCATTTTGTTTC 
CfCTCCGAGTCATACGAGCTTTTGGGAAAGGGTCAAGGATGCCTTTATTGGAGAGATTGACTCATGA 
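
The polynucleotide and amino acid listings are related by the genetic code, which gives a way to cross-check legible stretches of the two listings against each other. The Python sketch below is illustrative only: it assumes the standard codon assignments, the names `CODON_TABLE` and `translate` are hypothetical, and the input is just the first printed line of SEQ ID NO: 8 above.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, stop_at_stop=True):
    """Translate a DNA string codon by codon, starting at the first base.

    Any incomplete trailing codon is ignored, and codons containing
    characters other than A, C, G or T are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*" and stop_at_stop:
            break
        protein.append(residue)
    return "".join(protein)


# First printed line of SEQ ID NO: 8 (65 bases; the last two form a partial codon).
SEQ8_FIRST_LINE = (
    "ATGACACAGATGAATTATACAGGTAAGGTAAAACGAGTTGCTATTATTGCAAATGGTAAGTACCA"
)
n_terminal_residues = translate(SEQ8_FIRST_LINE)
```

Comparing `n_terminal_residues` with the start of SEQ ID NO: 7 shows where the printed amino acid listing has been damaged in extraction. The sketch does not attempt to repair the damaged listings themselves.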

Preferred GAS 236 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 7; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 7, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 236 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 7. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 7. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 7. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 7
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(5) GAS 040 

GAS 040 corresponds to M1 GenBank accession numbers GI:13621545 and GI:15674449, to M3
GenBank accession number GI:21909733, to M18 GenBank accession number GI:19745402, and is
also referred to as 'Spy0269' (M1), 'SpyM3_0197' (M3), 'SpyM18_0256' (M18) and 'prgA'. GAS
040 has also been identified as a putative surface exclusion protein. Amino acid and polynucleotide
sequences of GAS 040 from an M1 strain are set forth below:

SEQ ID NO: 9 

MDLEQTKPNQVKQKI ALTSTIALLSA SVGVSHQVKADDRASGBTKASNTHDDSL PKPETT OB A K* AT Tn A\7 
EKTLSQQKAELTEIiATALTKTTAEINHLKEQQDNEQ 

1 5 TETELHNAQADQHSKETALSEQKASISAETTRAQDLVEQVKTSEQNIAKLNAMI SNPDAITKAAQTANDN 
TKALSSELEKAKADLENQKAKVKKQLTEELA^ 

PLBELKKLEASGYI GSAS YNNYYKEHADQI I AKASPGNQLNQYQDI PADRNR FVDPDNLTPEVQNBLAQF 
AAHMINSVRRQLGLPPVTVTAGSQEFARLLSTSYKKraGNTR I EDSA 

GASGL I RNDDNM YEN I GAFND VHTVNG I KRGI YDS I KYMLFTDHLHGNT YGHAINFLRVDKHN PNAPVYL 
20 GFSTS^GSLNEHFVMFPESNIANHQRFNKTPIKAVGSTKDYAQRVGWSDTIAAIKGK^ 
HQEADIMAAQAKVSOLQGKLASTLKQSDSLNLQVRQLiroT 
SLKAALHQTEALAEQAAARVTALVAKKAHLOYLRDFKLN^ 

LAALQAKQSSLEATI ATTEHQLTLLKTLANE KE YRHLDEDI ATVPDLQVAPPLTGVKPLSYSKI DTTPLV 

QEMWETKQLLEASARLAAENTSLVAEALVGQTSEMVASNAIVSKITSSITQPSSKTSYGSGSSTTSNLI 
25 SDVDESTQR ALKAGVVMLAAVGLTGFRFRKSSK 

SEQ ID NO: 10 

ATGGACTTAGAACAAACGAAGCCAAACCAAGTTAAGCAGAAAAT^ 

TGAGTGCCA GTCTAGGCGTATCTCACCAAGTCAAAGCAGATGATAflAar PTr ana Aa a, ft ArQftftgocqAG 
30 TAATACTCACGACGATAGTTTACCAAAACCAGAAAC^^ 

GAAAAAACTCTCAGTCAACAAAAAGCAGAACTCACAGAGCTTCCT 
AAATCAACGACTTAAAAGAGCAGCAAGATAATGAACAAAAAGCTTTAACCTCTGCAC 

TAATACTCTTGCAAGTAGTGAGGAGACGCTATTAGCCCAAGGAGCCGAACATCAAAGAGA 
ACTGAAACAGAGCTTCATAATGCTCAAGCAGATCAACATTC^ 
35 CTAGCATTTCAGCAGAAACTACTCGAGCTC^ 

TGCTAAGCTCAATGCTATGATTAGCAATCCTGATGCTATCACTAAAGCAGCTCAAACGGCTAA 
ACAAAAGCATTAAGCTCAGAATTGGAGAAGGCTAAAGCTGACT^ 

AGCAATTGACTGAAGAGTTGGCAGCTCAGAAAGCTXX^T 

TAAATCCTC^GCTCCGTCTACTCAAGATAGCATTG 

40 CCTCTTGAAGAACTTAAAAAATTAGAAGCTAGTGGTTATATTGG^ 

AAGAGCATGCAGATCAAATTATTGCCAAAGCTAGTCCAGGTAATCAATTAAATCAATACCAAGATATTCC 
AGCAGATCGTAATCGCTTTGTTGATCCCGATAATTTGACACCAGAAGTGCAAAATC 

GCAGCTCACATGATTAATAGTGTAAGAAGACAATTAGGTCTACCACCAGTTACTCT 
AAGAATTTGCAAGATTACTTAGTACCAGCTATAAGAAA^^ 

45 CGGACAGCCAGGGGTATCAGGGCATTATGGTGTTGGGCCTCATGATAAAACTATTATTGAAGAC 

GGAGCGTCAGGGCTCATTCGAAATGATGATAACATGTACGAGAATATCGGTGCTTTTAA 
CTGTGAATGGTATTAAACGTGGTATTTATGAC^ 

AAATACATACGGCCATGCTATTAACTTTTC^ 
GGATTTTCAACCAGCAATGTAGGATCTTTGAATGAACACTT^ 
50 ACCATCAACGCTTTAATAAGACCCCTATAAAAGCCGTTGG 

CACTGTATCTGATACTATTGCAGCGATCAAAGGAAAAGTAAGC^ 

CATCAAGAAGCTGATATTATGGCAGCCCAAGCTAAAG 

TTAAGCAGTCAGACAGCTTAAATCTCCAAGTGAGACAATTAAAT^ 
ATTACTAGCAGCTAAAGCAAAACAAGCACAACTCGAAGCTACTCGT^
TCX3TTGAAAGCCGCACTGCACCAGACAGAAGCCTTAGCAG 
TGGCTAAAAAAGCTCATTTGCAATATCTAAGGGACTTTAAATTGAATCCT 
TGAGCGCATTGATAATACTAAGCAAGATTTGGCTAAAACTACCT 
5 TTAGCAGCCTTACAAGCTAAACAAAGC^GTCTAGAAGCTACTATTGCT 

TGCTTAAAACCTTAGCTAACGAAAAGGAATATCGCCACTTAGACGAAGATATAGCTACTGTC 
GCAAGTAGCTCCACCTCTTACGGGCGTAAAACCGCTATCATATAGTAA^ 
CAAGAAATGGTTAAAGAAACGAAACAACTATTAGAAGCTTCAGCAAGACT 
TTGTAGCAGAAGCGCTTGTTGGCCAMCCrCTGAAATGGTAGCAAGTAATG 
1 0 ATCTTCGATTACTCAGCCCTCATCTAAGACATCCT 

TCTGATGTTGATGAAAGTACTCAAAGA GCTCTTAAAGCAGGAGTCCT 
CAGGATTTAGGTTCCGTAAGGAATCTAAGTGA 

Preferred GAS 040 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 9; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 9, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 040 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 9. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 9. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 9. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 9 is removed. As another example, in one embodiment, the underlined amino acid
sequence at the C-terminus of SEQ ID NO: 9 is removed. Other fragments omit one or more domains
of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane
domain, or of an extracellular domain).

(6) GAS 389 

GAS 389 corresponds to M1 GenBank accession numbers GI:13622996 and GI:15675772, to M3
GenBank accession number GI:21911237, to M18 GenBank accession number GI:19746884, and is
also referred to as 'Spy1981' (M1), 'SpyM3_1701' (M3), 'SpyM18_2045' (M18) and 'relA'. GAS
389 has also been identified as a (p)ppGpp synthetase. Amino acid and polynucleotide sequences of
GAS 389 from an M1 strain are set forth below:

SEQ ID NO: 11 

35 MRNEMAKIMNVTGEEVIALAATYMTKADVAFVAKAL 
DAVTVACGFLHDWEDTDITLDEIEADFGHDARDITO 

VI LVKLADRLHNMRTLKHLRKDKQER I SRETMEI YAPLAHRLGI SR I KWELEDLAFR YLNETEFYKI SHM 
MKEKRREREALVEAI VSKVKTYTTQQGLFGDVYGRPKHI YSI YRKMRDKKKRFDQI FDLIAIRCVMBTQS 
DVYAMVGYI HELWRPMPGRFKDYI AAPKANGYQS IHTTVYGPKGPI EIQIRTKDMHQVAEYGVAAHWAYK 
40 KGVRGKVNQAEQAVGMNWI KELVELQDASNGDAVDFVDSVKEDI FS ER I YVFTPTGAVQELPKESGPIDF 
AYAIHTQIGEKATGAKVNGRMVPLTAKLKTGDVVEIITNA^ 

KELSVNKGRDLLVSYFQEQGYVANKYLDKKRIEAILPKVSVKSEESLYAAVGFGDISPISVFNKLTEKER 
REEERAKAKAEAEELVKGGEVKHENKDVLKVRSENGVI I QGASGLLMR I AKCCNPVPGDP I DGY I TKGRG 
I AI HRSDCHNI KSQDGYQERLIE VE WDLDNS S KDYQAE I DI YGLNRSGLLNDVLQI LSNSTKSISTVNAQ 
45 PTKDMKFAN IHVSFGI PNLTHLTTWEKI KAVPDVYSVKRTNG 

SEQ ID NO: 12 

ATGAGGAACGAAATGGCAAAAATAATGAACGTAACAGGAGAAGAAGTCATTG^ 

TGACCMGGCTGATGTGGCTTTTGTGGCAAAGGCTTT 

C^GAAAGTCAGGCGAACCCTATA TCGTCC ATCC 

GATGCTGTGAC^GTTGCTTGTGGCTTTTTA 

TCGAAGCAGACTTTGGCCATGATGCTCGTGATATCGTTGAT^ 
5 CAA ATCTCA TGAGGAGCAACTCGCCGAA^ 

GTGATTTTGGTGAAATTGGCTGACCGCCTGCATAA^ 

AAGAGCGCATTTCGCGCGAAACCATGGAAATCTATGCCCCCTTGGTO 
OVMTGGGAACrAGAAGATTTGGCTTTTCGTTACCT 

ATGAAAGAAAAACGTCGCGAGCGTGAAGCTTTGGTAGAGGCTATTGTCAGTAAGGTCAAAACCT 
1 0 CACAACAAGGGTTGTTTGGAGATGTGT ATGGC CGAC CAAAACACATTTATTCGATTTATCGGAAAATGCG 
GGACAAAAAGAAACGATTCGATCAGATTTTTGATCTGATTC 
GATGTCTATGCTATGGTTGGCTATATTCATGAGCTT 

TTGCAGCTCCTAAAGCTAATGGCTACOUSTCTATTCATACCACCGTGTATGGGCCAA^ 
GATTCAAATCAGAACTAAGGACATGCATCAAGTGGCTGAGTACGGGOTTGCTGCTC^ 
1 5 AAAGGCGTGCGTGGTAAGGTCAATCAAGCTGAGCAAGCCGTTGGCATGAACTGGA 
AATTGCAAGATGCCTCAAATGGCGATGCAGTGGACTTTGTGGATTCGGTCAAAG^ 

ACGGATTTATGTCTTTACACCGACAGGGGCCGTTCAGGAGTTACCAAAAGAATCAGCT 
GCTTATGCX^TCCATACGCAAATCGGTGAAAAAGCAACAGGTGCCAAAGTCAAT^ 

TCACTGCCAAGTTAAAAACAGGAGATGTGGTTGAAATC^TCACCAATC 
20 AGACTGGGTAAAACTGGTCAAAACCAATAAGGCTCGCAACAAAA1TC 

AAGGAATTGTCAGTGAATAAAGGCCGTGATTTGTTGGTGTCTTATTTTCAAGA 

ATAAATACCTTGACAAAAAACGCATTGAAGCCATCCTTCCAAAAGTCAGTGTGMGAGCGAAGAATCAC^ 
CTATGCAGCCGTTGGGTTTGGTGACATTAGTCCTATCAGTGTCTTTAACAAGTTAACCGAAAAAGAGCGC 
CGTGAAGAAGAAAGGGCCAAGGCTAAAGCAGAAGCTGAAGAATTGGTTAAGGGCGGTGAGGTC^ 
25 AAAACAAAGATGTGCTCAAGGTTCGCAGTGAAAATGGAGTCATTATCCAAGGAGC^ 

GCGGATTGCCAAGTGTTGTAATCCTGTACCTGGTGATCCTATTGACGGCTACATTACCAAAGGGCGTGGC 

ATTGCGATTCACAGATCGGACTGTCATAACATTAAGAGTCAAGATGGCTACCAAGAACGCT^ 

TCGAGTGGGATTTGGACAATTCGAGTAAAGATTATCAGGCT 

TGGTCTGCTTAATGATGTGCTCCAAATTTTATCAAACTCAACCAAGAGCATATCGACAGTCAATC 
30 CCGACGAAGGACATGAAGTTTGCTAATATTCACGTGAGCTTTGGCATTCCAAATCTO 

CTGTTGTCGAAAAAATCAAGGCAGTTCCAGATGTTTATAGCGTGAAGCGGACCAATGGCTAA 

Preferred GAS 389 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 11; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 11, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 389 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 11. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 11. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 11. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(7) GAS 504 

GAS 504 corresponds to M1 GenBank accession numbers GI:13622806 and GI:15675600, to M3
GenBank accession number GI:21911061, to M18 GenBank accession number GI:19746708, and is
also referred to as 'Spy1751' (M1), 'SpyM3_1525' (M3), 'SpyM18_1823' (M18) and 'fabK'. GAS 504
has also been identified as a putative trans-2-enoyl-ACP reductase II. Amino acid and polynucleotide
sequences of GAS 504 of an M1 strain are set forth below:

SEQ ID NO: 13

MKTRI TBLLN I DY PI FQGGMAWVADGDLAGAVSN AGGLGI IGGGNAP KE WKAN I DRVKA I TDRPFGVNI 
MLLSPPAODI VDL VI EEGVKWTTGAGN PG KYMER LHQAG 1 1 WPWPSVALAKRMEKLGVDAVIAEGME 
AGGHI GKLTTMSLVRQ WEAVS I PVI AAGG I ADGHGAAAAFMLGAEAVQ I GTRFWAKE SNAHQN F KDKI 
LAAKDIDTVI S AQWGHPVRS I KNKLT S AY AKAE KAFL I GQKTATD I EEMGAGS LRHAV I EGDWNGSVM 
5 AGQIAGLVRKEESCETILKDIYYGAARVIQNEAKRWQSVSIBK * 

SEQ ID NO: 14 

ATGAAAACACGTATTACAGAATTACTTAATATTGATT^ 
CTGATGGTGATTTAGCAGGTGCAGTTTCTAATC 
1 0 CAAAGAAGTCGTTAAAGCTAATATTGATCGTGTCAAAGCTATTACTGATAGACCTTTT^ 
ATGCTTTTATCTCCTTTTGCTGATGATATrc 

C^GGCGCAGGAAATCCAGGAAAGTATATGGAAAGACTGCACCAGGCGGGTATAATCGTTGTTC^ 

CCCAAGCGTTGCGCTAGCCAAAraTATGGAAAAGCTTGGGGTAGATGCTC 

GCTGGAGGACATATTGGCAAGTTAACGACTATGTCTTTAGTAAGACAAGTTGTTC 
1 5 CTGTCATTGCGGCAGGTGGTATAGCTGATGGTCATGGTGCAGCAGCAGCAT^ 

TGTTCAAATTGGAACTCGCTTTGTTGTTGCTAAAG 

TTAGCAGCAAAAGATATTGATACGGTGATTTCTGCGCAGGTTGTGGGCC^ 

ATAAATTGACCTCAGCTTACGCTAAAGCAGAAAAAGCATTTTTAATTGGTCAAAAAACAG 

TGAAGAAATGGGAGCAGGATCGCTTCGACACGCTGTTATTGAAGGCGATGTAGTC^ 
20 GCTGGCCAAATTGCAGGGCTTGTGAGAAAAGAAGtf^GCTC 

GTGCAGCTCGTGTTATTCAAAATGAAGCTAAGCGCTGGCAATCTGTTTCAATAGAAAAGTAG 

Preferred GAS 504 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 13; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 13, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 504 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 13. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 13. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 13.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(8) GAS 509 

GAS 509 corresponds to M1 GenBank accession numbers GI:13622692 and GI:15675496, to M3
GenBank accession number GI:21910899, to M18 GenBank accession number GI:19746544, and is
also referred to as 'Spy1618' (M1), 'SpyM3_1363' (M3), 'SpyM18_1627' (M18) and 'cysM'. GAS
509 has also been identified as a putative O-acetylserine lyase. Amino acid and polynucleotide
sequences of GAS 509 of an M1 strain are set forth below:

SEQ ID NO: 15

MTKI YKTITELVGQTPI I KLNRLI PNEAADVYVKLEAFNPGSSVKDRIALSMI EAAEAEGLIS PGDVI IB 
PTSGNTGIGLAWVGAAKGYRVI I VMPETMSLERRQI IQAYGAELVLTPGAEGMKGAI AKAETLAIELGAW 
MPMQFNNPANPS I HEKTTAQEILEAFKE I S LDAFVSGVGTGGTLSGVSHVLKKAN PETV I YAVEABES AV 
LSGQEPGPHKIQGI SAGFI PNTTiDTKAYDQI IRVKSKDALETARLTGAKEGFLVGI SSGAALYAAI EVAK 
45 QLGKGKHVLTI LPDNGERYLSTELYDVPVI KTK 

SEQ ID NO: 16 

ATGACTAAAATTTACAAAACTATAACAGAATTAGTAGGTCAAAC 
TTCCAAACGAAGCTGCTGACGTTTATGTAAAATTAGAAG 
50 TATTGCTTTATCGATGATTGAAGCTGCTGAAGCTGAAGGTCTG 
CCAACAAGTGGTAATACAGGTATTGGTCTTGCATGGGTAGGTGCTGCTAAAGGGT
TTATCCCCGAAACTATGAGCTTGGAAAGACGGCAAATC^ 
ACCTGGAGCAGAAGGTATGAAAGGGGCTATTGCAAAAGCTGAAAC^ 
ATGCCTATGCAATTTAATAACCCTGCCAATCGAAGCATC 
5 AAGCTTTTAAGGAGATTTCTTTAGATGCATTCGT 

TTCACATGTCTTGAAAAAAGCTAACCCTGAAACTGTTATCTATC 

TTATCTGGTCAAGAGCCTGGACCACATAAAATTCAAGGTATATCAGCTGG^ 

ATACCAAAGCCTATGACCAAATTATCCGTGTTAAATCGAAAGATGCTTTAGAAACTGCT 
AGCTAAGGAAGGCTTCCTGGTTGGGATTTCTTCTGGAG^ 

10 CAGTTAGGAAAAGGCAAACATGTGTTAACTATTTTACCAGATAATGGCGAACGCTATTTA 
TCTATGATGTACCAGTAATTAAGACGAAATAA ' ~ : 

Preferred GAS 509 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 15; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 15, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 509 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 15. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 15. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 15. For
example, in one embodiment, the underlined amino acid sequence at the C-terminus of SEQ ID NO:
15 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
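
The fragment criteria recited above (a run of at least n consecutive amino acids, and removal of residues from either terminus) can be illustrated with a minimal Python sketch. It is not part of the specification: the function names, the toy reference string, and the exact-substring test are assumptions made purely for illustration, and the sketch does not address whether a fragment comprises an epitope.

```python
def is_qualifying_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """Return True if candidate is a run of at least n consecutive residues of reference.

    Assumption: an exact, case-insensitive substring match stands in for
    "consecutive amino acids of" the reference sequence.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    return len(candidate) >= n and candidate in reference


def trim_termini(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return reference lacking n_term residues at the N-terminus and c_term at the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term > len(reference):
        raise ValueError("trim lengths must be non-negative and fit within the sequence")
    return reference[n_term:len(reference) - c_term]


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT any SEQ ID NO from this document.
    toy_reference = "MKTAYIAKQRQISFVKSHFSRQ"
    print(is_qualifying_fragment("AYIAKQR", toy_reference))  # 7 consecutive residues
    print(is_qualifying_fragment("AYIAKQ", toy_reference))   # only 6 residues
    print(trim_termini(toy_reference, n_term=2, c_term=3))
```

A truncated sequence returned by trim_termini is itself a fragment in the sense of (b) whenever at least n residues remain.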

(9) GAS 366

GAS 366 corresponds to M1 GenBank accession numbers GI:13622612, GI:15675424 and
GI:30315979, to M3 GenBank accession number GI:21910712, to M18 GenBank accession number
GI:19746474, and is also referred to as 'Spy1525' (M1), 'SpyM3_1176' (M3), 'SpyM18_1542'
(M18) and 'murD'. GAS 366 has also been identified as a UDP-N-acetylmuramoylalanine-D-glutamate
ligase or a D-glutamic acid adding enzyme. Amino acid and polynucleotide sequences of
GAS 366 of an M1 strain are set forth below:

SEQ ID NO: 17 

MKVI SNFQNKKI LI LGLAKSGEAAA KLLTKIXaALVTVNDSKPFTONPAAOALLEEGI KVI CGSHPVELLD 
ENFEYMVKNPGI PYDNPMVKRALAKEI PI LTEVELAY FVS EAP 1 1 GI TGSNGKTTTTTMI ADVLNAGGQS 
35 ALLSGN I G YPAS KWQKAI AGDTLVMELS S FQLVGVNAFR PHI AVITNLMPTHLDYHGSFEDYVAAKWMI 
QAQMTBSDYLILNANQEISATLAKTTKATVI PFSTQKWDGAYLKDGILYFKEQAI IAATDLGVPGSHNI 
ENALATI AVAKLSGI ADDI IAQCLSHFGGVKHRLQR VGQI KDITFYNDSKSTNIIATQKALSGFDNSRLI 
LI AGGLDRGNEFDDLVPDLLGLKQM 1 1 USE S AERMKRAANK^ 
LSPANAS WDMYPNFBVRGDE FLATFDCLRGDA 

SEQ ID NO: 18

ATGAAAGTGATAAGTAATTTTCAAAACAAAAAAATATTA^ 

CAGCAAAATTATTGACCAAACTTGGTGCTTTAGTGACTGTTAAT 

AGCGGCACAAGCCTTGTTGGAAGAGGGGATTAAGGTC&TTTG 

GAGAACTTTGAGTACATGGTTAAAAACCCTGGGATTCCTTATGATAATCCTATGGTTAAACGCGCC 
45 CAAAGGAAATTCCCATCTTGACTGAAGTAGAATTGGCTTATTTCGT 

TACAGGATCAAACGGGAAGAC^ACCACAACGACAATGATTGCCGATGTTTTGAATGCTG^ 
GCACTCTTATCTGKSAAACATTGGTTATCCTGCTT^ 

TGGTGATGGAATTGTCCTCTTTTCAATTAGTGGGAGTGAATGCTTOT
TAATTTAATGCCGACTCACCTGGACTATCATGGCAGTTTTC^
CAAGCTCAGATGAGAGAATCAGACTACCTTATTTTAAA 
AGACCACCAAAGCAACAGTGATTCCTTTTTCAACTCAAAAAGTGGl^ 
AATACTCTATTTTAAAGAACAGGCGATTATAGCTGCAACIX1ACTTAGGTC 
5 GAAAATGCCCTAGCAACTATTGCAGTTGCCAAGTTATCt^Xs 
TTTCACATTTTGGAGGCGTTAAACATCGTTTC 

TGACAGTAAGTCAACCAATATTTTAGCCACTCAAAAAGCTTC 
TTGATTGCTGGCGGTCTAGATCGTGGCAATGAATTT^ 
AGATGATTATTTTGGGAGAATCCGCAGAGCGTATGAAGCGAGCTC 
1 0 TGAAGCTAGAAATGTGGCAGAAGCAACAGAGCTTGCTTTTAAGCTGGCCCA^ 
CTTAGCCCAGCCAATCCTAGCTGGGATATOTATC^ 
CCTTTGATTGTTTAAGAGGAGATGCCTAA 

Preferred GAS 366 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 17; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 17, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 366 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 17. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 17. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 17. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
17 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(10) GAS 159 

GAS 159 corresponds to M1 GenBank accession numbers GI:13622244 and GI:15675088, to M3
GenBank accession number GI:21910303, to M18 GenBank accession number GI:19746056, and is
also referred to as 'Spy1105' (M1), 'SpyMajW?' (M3), 'SpyM18_1067' (M18) and 'potD'. GAS
159 has also been identified as a putative spermidine/putrescine ABC transporter (a periplasmic
transport protein). Amino acid and polynucleotide sequences of GAS 159 of an M1 strain are set
forth below:

SEQ ID NO: 19 

MRKLYSFIiAGVLGVIVILTSLSFI I^QKKSGSGSQSDKLV I YNWGDY I DPALLKKFTKETGI EVQYETFDS 
35 NEAMYTKI KQGGTTYDI AVPSDYTIDKMI KENLI^KLDKSKLVGMDNIGKEFLGKSFDPQNDYSLPYFWG 
TVGIVYNDQLVDKAPMHWEDLWRPEYKNSIMLIDGA^ 
PWKAIVADBMXGYMI<^DAAIGITFSGEASEM^^ 

FLNFINRPENAAQNAAYIGYATPNKKAKALLPDEl KNDPAFYPTDDI I KKLEVYDNLGSR WI/5I YNDLYL 
QFKMYRK 


SEQ ID NO: 20 

ATOCGTAAACT 

TCTTGCAGAAAAAATCGGGTTCTGGTAGTCAATCGGATA 
TGATCCAGCTTTGCTCAAAAAATTCACCAAAGA 

45 AATGAAGCCATGTACACTAAAATCAAGCAGGGCGGAACCACTTACGACATTGCTGTTCCTAGTC 

CCATTGATAAAATGATCAAAGAAAACCTACTCAATAAGCTTGATAAGTCAAAATTAGTTGGCATGGATAA 

TATCGGGAAAGAATTTTTAGGGAAAAGCTTTGACC 

ACCGTTGGGATTGTTTATAATGATCAATTAGTTGATAAG 

CAGAATATAAAAATAGTATTATGCTGATTGATGGAGCGCGTGAAATGCTAGGGGTTGGTTTAACAACTTT
TGGTTATAGTGTGAATTCTAAAAATCTAGAGCAGTTCCAGGCAGCC
CCGAATGTTAAAGCCATTGTAGCAGATGAGATGAAAGGCTACATGATTCA^

TTACCTTTTCTGGTGAAGCCAGTGAGATG^ 

AGGGTCTAACCTTTGGTTTGATAATTTGGTACTACCAAAAACCATGA^ 
5 TTTTTGAACTTTATCAATCGTCCTGAAAATGCTGCGCAAAATG 

ATAAAAAAGCC^GGCCTTACTTCCAGATGAGATAAAAAATGATCCTGCTTTTT 
TATCAAAAAATTGGAAGTTTATGACAATTTAGGGT 

CAATTTAAAATGTATCGCAAATAA

Preferred GAS 159 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 19; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 19, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 159 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 19. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 19. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 19. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
19 is removed. In another example, the underlined amino acid sequence at the C-terminus of SEQ ID
NO: 19 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a
signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(11) GAS 217 

GAS 217 corresponds to M1 GenBank accession numbers GI:13622089 and GI:15674945, to M3
GenBank accession number GI:21910174, to M18 GenBank accession number GI:19745987, and is
also referred to as 'Spy0925' (M1), 'SpyM3_0638' (M3), and 'SpyM18_0982' (M18). GAS 217 has
also been identified as a putative oxidoreductase. Amino acid and polynucleotide sequences of GAS
217 of an M1 strain are set forth below:

SEQ ID NO: 21 

30 MAQRIIVITGASGGXAQAIVKQLPKEDSLILLGRNKERLEHCYQHIDNKECLELDITNPVAIEKMVAQIY 
QRYGRI DVLINNAGYC^FKGFEEFS AQEI ADMFQVOT 

SAKSS I YS ATKFALI GFSNAX*RLELADKGVYVTTVN PGP I ATKF FDQADPSGHYLESVGKFTLQPNQVAK 
RLVSIIGKNKRELNLPFSLAVTHQFYTLFPKLSDYLARKVFNYK 

SEQ ID NO: 22

ATGGCACAAAGAATCATTGTTATCACGGGAGOT 

CCAAGGAAGACAGCTTGATTTTATTAGGACGT^ 

CAACAAAGAATGCCTCGAGTTGGATATTACCAATCCAGTAGC 

40 CAGCGCTATGGCCGTATTGATGTCTTGATTAATAATGCTGGCT 

TTTCTGCCCAAGAAATAGCTGATATGTTTCAGGTTAACACCCTAGCGAGCATTCACTTTGCTTG^ 
TGGTCAGAAAATGGCAGAGCAGGGGCAAGGTCACCTTATTAAT 

TCAGCCAAATCGAGCATTTATTCAGCCACCAAGTTTGCCCTTATCGGATTTTCCAA 

AATTAGCGGATAAAGGGGTTTACGTGACGACCGTGAATCCAGGTCC^ 

45 AGCTGACCCGTCTGGACATTATTTGGAAAGCGTTGGTAAATT^ 

CGTTTGGTTTCTATTATCGGGAAAAATAAACGAGAATTGAAT 

AATTTTACACCCTTTTCCCTAAATTATCTGATTATCTTCCAAGAAAGGTATTTAATTATAAATGA

Preferred GAS 217 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 21; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 21, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 217 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 21. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 21. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 21.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
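
The percent identity thresholds recited for these proteins can be illustrated with a short Python sketch. This passage does not itself state how identity is calculated, so the sketch rests on assumptions that are not drawn from the text: the two sequences are supplied already aligned, with '-' marking gaps, and identity is taken as identical residues divided by aligned columns. Other conventions for the denominator exist, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumptions: both strings have equal length, '-' marks a gap, and a
    column where both sequences have a gap is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue from either sequence
        compared += 1
        if a == b:  # both-gap columns were skipped, so a match is a residue match
            matches += 1
    if compared == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, NOT any SEQ ID NO from this document.
    identity = percent_identity("MKTA-IAK", "MKTAYIAR")
    print(f"{identity:.1f}% identity; meets 50% threshold: {identity >= 50.0}")
```

Under these assumptions a candidate would satisfy criterion (a) when the returned value is 50.0 or greater.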

(12) GAS 309 

GAS 309 corresponds to M1 GenBank accession numbers GI:13621426 and GI:15674341, to M3
GenBank accession number GI:21909633, to M18 GenBank accession number GI:19745363, and is
also referred to as 'Spy0124' (M1), 'SpyM3_0097' (M3), 'SpyM18_0205' (M18), W and 'rofA'.
GAS 309 has also been identified as a regulatory protein and a negative transcriptional regulator.
Amino acid and polynucleotide sequences of GAS 309 of an M1 strain are set forth below:

SEQ ID NO: 23 

MIBKYLESSIESKCQLIVLFFKTSYLPITEVAECTG 
20 THPFKETYLYQLYASSNVLQLU^^ 

KIVGBEYRIRYLIALLYSKFGIKVYDLTQQDKNTIHSFLSHSSTHLKTSPWLSESFSFYDILIiALSWKRH 
QFSVTI PQTRI FQQLKKLFVYDSLKKSSHDI I ETYCQ^ 

QYCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALM^ 

TSLKLIVEEWMAKLPGKRDLNHKHFHLFOT 

25 I DFHS YYLLQDNVYQIPDLKPDLVI THSQLI PFVHHBLTKGI AVAEISFDESILSIQELMYQVKEEKFQA 
DLTKQLT 

SEQ ID NO: 24 

TTGATAGAAAAATACTTGGAATCATCAATCGAAT 

30 CTTATTTGCCAATAACTGAGGTAGCAGAAAAAACTGGCTTAACCTTTTTACAACTAAACCATTATTGTGA 
GGAACTGAA^ 

AGACATCCTTTTAAAGAAACTTATCTTTACCAA 

TTTTAATAAAAAATGGTTCCCACTCTCGTCCCOT 

CTCAGCTTATCGGATGCGCGAAGCATTGATTCCTTTAT^ 

35 AAGATTGTCGGTGAGGAATATCGCATCCGTTACCTCATCGCTCTGCTATATAGTAAGTTTGGCATTAAAG 
TTTATGACTTCACGCAGCAAGACAAAAACACT^ 

AACCTCTCCTTGGTTATCGGAATCGTTTTCTTTCT 
CAATTTTCGGTAACTATTCCCCAAACC^ 

TGAAAAAAAGTAGCCATGATATTATCGAAACTTACTGCCAACTAAA 

40 CCTCTATTTAATTTATATCACCGCTAATAATTCTTTTGCGAGCnTACAATGGACACCTGAGCA 
CMTATTGTCAACTTTTTGAAGAAAATGATACTTTTCGCOT 

CTAACCTAAAAGAGCAAAAGGCTAOTTTAGTAAAAGCTCTTATGTTTTTTTCAA 

TCTGCAACATTTTATTCCTGAGACCAACTTATTCGTTTCT 

ACGTCCTTAAAGTTAATTC^ 

45 ATTTTCATCTTTTTTGCCACTATGTCGAGCAAAGTCTAAGAAATATCC^ 
CGTAGCCAGTAATTTTATCAATGCTCATCTCCTAACGGA 

ATTGATTTTCATTCCTATTATCTATTGCAAGATAATGT1TATCAM 
TCATC^TCACAGTCAACTGATTCCT 
ATCTTTTGATGAATCGATTCTGTCTA^ 
GATTTAACCAAGCAATTAACATAA

Preferred GAS 309 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 23; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 23, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 309 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 23. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 23. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 23.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(13) GAS 372 

GAS 372 corresponds to M1 GenBank accession numbers GI:13622698 and GI:15675501, to M3
GenBank accession number GI:21910905, to M18 GenBank accession number GI:19746500, and is
also referred to as 'Spy1625' (M1), 'SpyM3_1369' (M3), and 'SpyM18_1634' (M18). GAS 372 has
also been identified as a putative protein kinase or a putative eukaryotic-type serine/threonine kinase.
Amino acid and polynucleotide sequences of GAS 372 of an M1 strain are set forth below:

SEQ ID NO: 25 

20 M I QI GKLFAGRYR I LKS I GRGGMADVYLANDLI LDNEDVAI KVLRTNYQTDQVAVARFQRBARAMAELNH 
PNI VAIRDI GEEDGQQPLVMEYVDGADLKRYI QNHAPLSNNBVVRI ^EVLSAMTLAHQKGI VHRDLKPQ 
NILLTKEGWKVTDFGIAVAFAETSLTQTNSMLGSVHYLSPEQA^ 

PYDGDSAVTIALQHFQKPLPSIIEENHNVPQALEKWIRATAKKLSDRYGSTFEMSRDLMTALSYNRSRE 
RK1IFENVESTKPLPKVASGPTASVKLSPPTPTVLTQESRLDQTO 

25 FSF F I VGVALFTYLI LTKPTS VKVPNVAGTSLKVAKQELYDVGLKVGKI RQI ESDTVAEGNWRTDPKAG 

TAKRQGSSITLYVSIGNKGFDMENYKGLDYQEAmSLIETYGVPKSKIKIERIVTNEYPENTVISQSPSA 

GDKFNPNGKSKITLSVAVSDTITMPMVTEYSYADAVNTLTALGIDASRIKAYVPSSSSATGFVPIHSPSS 

KAIVSGQSPYYGTSLSLSDKGEISLYLYPEETHSSSSSSSSTSSSNSSSINDSTAPGSNTELSPSETTSQ 
TP 

SEQ ID NO: 26

ATGATTCAGATTGGCAAATTATTTGCTCGT^ 

CGGATGTTTATTTAGCAAATGACTTGATCTTGGATAATG^GACGTTGCAATCAAGGTCTTGCGTACCAA 
TTATCAAACAGATCAGGTAGCAGTTGCGCGTTTCCAACGAGAAGC^ 

35 CCCAATATTGTTGCCATCCGGGATATAGGTGAAGAAGACGGACAGCAAT^ 
ATGGTGCTGACCTAAAGAGATACATTCAAAATCATGCT^ 

GGAAGAAGTCCTTTCTGCTATGACTTTAGCCCACCAAAAAGGAATTGTACACAGAGATTTAAAACCTCAA 
AATATCCTACTAACTAAGGAGGGTGTTGTCAAAGTAACTGATTTCGGCATCGCAGTAGCCTTTGCAGAAA 
CAAGCTTGACACAAACTAATTCGATGTTAGGCAGTGTTCATTACTTC 
40 CAAAGCGACGATTCAAAGTGATATTTATGCX^TGGGGATTATG 
CCTTATGACGGCGATAGTGCTGTTACGATTGCCTTC 
AGGAGAACGACAATGTGCCACAAGCTTTGGAGAATGTTGTTATTCGAGCAA 
TCGTTACGGGTCAACCTTTGAAATGACTCGTGACTTAATGACGGCGCTTAGT^ 

CGTAAGATTATCTTTGAGAATGTTGAAAGTACCAAACCCCTCCCCAAAGTGGCCTCAGGTCCCACCGCT 
45 CTGTPJiAATTG^^ 

AGATGCTTTACAGCCCCCCACGAAAAAGAAAAAAAGTGGTCGTT 

TTTTCTTTCTTTATTGTAGGTGTAGCACTCTTTACTTATCTTATACTAACT 
TTCCTAATGTAGCAGGCACTAGTCTTAAAGTTGCCAAAGAAG^ 

TAAAATCAGGCAAATTGAGAGTGATACGGTTGCTGAGGGAAATGTAGTTA 
ACAGCTAAGAGGCAAGGCTCAAGCATTACGCTTTATGTGTCAATTGGAAACAAAGGT^
ACTACAAAGGACTAGATTATCAAGAAGCTATGAATAGTTTGATAGAAA

AATCAAAATTGAGCGC^TTGTAACTAATGAATATCCTGAAAATACAGT 

GGTGATAAATTTAATCCAAACXK5AAAGTCTAAAATTACXSCTCAGTGTTGCTC 

TGCCTATGGTAACAGAATATAGTTATGCAGATGCAGTCAATACCTTAA^ 

5 TAGAATAAAAGCTTATGTGCCAAGCTCTAGCTCAGCAACGGGCTrTC 

AAAGCTATTGTCAGTGGTCAATCTCCTTACTATGGAACGTCTTTGACT 

GTOTTTACCTTTATCC^GAAGAAACACACTCTTCTAGTAGCTCATC 

TTCHTCAATAAATGATAGTACTGCACCAGGTAGCAACACTCAATTAAGCCCAT 
ACACCTTAA 

Preferred GAS 372 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 25; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 25, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 372 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 25. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 25. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 25. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(14) GAS 039 

GAS 039 corresponds to M1 GenBank accession numbers GI:13621542 and GI:15674446, to M3
GenBank accession number GI:21909730, to M18 GenBank accession number GI:19745398, and is
also referred to as 'Spy0266' (M1), 'SpyM3_0194' (M3), and 'SpyMlSJttSO' (M18). Amino acid
and polynucleotide sequences of GAS 039 of an M1 strain are set forth below:

SEQ ID NO: 27 

MDLILFLLVLVLLGLGAYLLFKVNGLQHQIJVQT^ 
30 LYQQLTDIRDVLHRSLSDSRDRSDKRLEKINQQ 

SFDSVSKQLESVNKGLGEMRSVAQDVGTLNKVLSNTKTRGI LGELQLGQI IEDIMTSSQYEREFVTVSGS 
SERVEYAIKLPGNGQGGYIYLPIDSKFPLEDYYRLEDAYEVGDKLAIEASRKALIJ^IEGIFAKDIHKKYL 
NPPETTNFGVMFLPTEGLYSEVVRNASFFDSLRREENIVVAGPSTLSALliNSLSVGFKTLNIQKNADDIS 

KI 1/3NVKLEFDKFGGLLAKAQKQMNTANNTLDQLI STRTNAI VRALNTVETYQDQATKSLLNMPLLEEEN 
35 NBN 

SEQ ID NO: 28 

ATGGACCTTATCnTGTTCCTTTTGGTCTTGGTTCTC 
ACGGCCTTCAACATCAGCTTGCCCAAACC^ 
40 CCAGTTGGATACAGCTAACAAACAAGAATTGTTAGAGCT 
CTTTACCAACAATTAACAGATATTCGTG&CG^ 

ACAAACGCTTAGAAAAAATTAACCAGCAGGTCAACGAATCGCTCAAAAATATGCAAGAATCTAACGAA^ 

ACGTTTGGAGAAAATGCGCCAGATCGTTGAAGAAAAATTGGAAGAAACCTTAAAAAATCGTCTGC 

TCTTTCGATTCTGTATCCAAGCAACTAGAAAGTGTCAATAAAGGCTTGGGAGAAATGCGTAGCGTGGCTC 
45 AAGATGTGGGTACTTTAAATAAGGTTTTGTCCAATACCAA 

AGGCCAAATCATTGAGGATATCATGACATCAAGCCAGTACGAAAGAGAATTTGT 

AGTGAACGCGTAGAATATGCGATTAAGCTCCCAGGAAATGGTCAAGGCGGTTATATTTACCTACCGATTG 
ACTCAAAATTCCCTCTTGAAGATTATTACCGATTAGAAGAT 

CGAGGCTAGCCGAAAAGCACTTCTGGCAGCTATCAAACGCTTTC 

50 AACCCCCCAGAGACGACCAATTTCGGAGTTATGTTCTTACCAACAGAAGGTCTTTATT 

GAAATGCGTCTTTCTTTGATAGCCTTCGTCGGGAAGAAAATATTGTGGTTGCAGGCCCTTCGACCCT
TGCTTTGCTGAATTCCTTATCTGTTGGTTTCAAGACCOT
AAAATTTTAGGCAATGTCAAGTTAGAATTCGATAAATTTGGCGGCCTC 
TGAATACAGCTAATAATACGCTGGATCAGCTCATTTC^ 
TACCGTTGAAACTTATCAAGACCAAGCAACAAAATCTCTCTTGAACATG 
5 AATGAAAATTAA 

Preferred GAS 039 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 27; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 27, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 039 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 27. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 27. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 27. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(15) GAS 042 

GAS 042 corresponds to M1 GenBank accession numbers GI:13621559 and GI:15674461, to M3
GenBank accession number GI:21909745, to M18 GenBank accession number GI:19745415, and is
also referred to as 'Spy0287' (M1), 'SpyM3_0209' (M3), and 'SpyMlSJ^' (M18). Amino acid
and polynucleotide sequences of GAS 042 of an M1 strain are set forth below:

SEQ ID NO: 29 

MTKEKLVAFSQAHARPAWLQEWILAALEAIPNLE 
25 NPKLVQVGTQTVLEQLPMALIDKGWFSDFYTALEEI PEVI EAHFGQALAFDEDKLiAAYHTAYFNSAAVL 
YVPDHLEITTPI EAI FLQDSDSDVPFNKHVLVI AGKESKFTYLERFES IGNATQKISANI SVEVI AQAGS 
QI KFSAIDRW3PSVTTY I SRRGRLEKDANIDWALAVMNEGNVI ADFDSDLIGQGSQADLKWAASSGRQV 
QG I DTRVTNYGORTVGH I LQHGV I LERGTLT FNG I GH I LKD AKGADAQQE SRVLMLSDQARAD AN P I LL I 
DENEVTAGHAAS I GQVDPEDMYYLMSRGLDQETAER1»VI RGFLGAVI AEI PI PSVRQEI I KVLDEKLLNR 


SEQ ID NO: 30 

ATGACAAAAGAAAAACTAGTGGCTTTTTCGCAAGCCCACGCTGAGCCT 
TAGCXXSCATTAGAAGCCATTCC^UVATTTGGAATTACCAACCATCGAAAGGG 
TCTAGGAGATGGTACCTTAACAGAAAATGAAAGTCTAGCTAGTC^ 
35 AACCCAAAGCTTGTTCAGGTAGGCACGCAAAGAGTCTTAGAAC^GTTACCAATGGCG 
GAGTTGTTTTCAGTGATTTTTATACGGCGCTTGAGGAAATCCCAGAAG 
GGCATTAGCTTTTGATGAAGAGAAACTAGCTGCCTACC^ 

TACGTTCCTGATCACTTGGAAATCACAACTCCTATTGAAGCTATT 
TTCCTTTTAACAAGCATGTTCTAGTGATTGCAGGAAAAGAAACT 

40 ATCTATTGGCAATGCCACTCAAAAGATCAGCGCTAATATCAGTGTAGAAGTGATTGCTCAAGCAGG 
CAGATTAAATTCTCGGCTATCGACCGCTTAGGTCCTTCAGTGACAACCTATATTAGCCCT 
TAGAGAAGGATGCCAACATTGATTGGGCCTTAGCTGTGATGAATGAAGGCAATGTCATTGCTGATTTTGA 
CAGTGATTTGATTGGTCAGGGCTC^CAAGCTGATTTGAAAGTTGTTGCAGCCTCAAGTGG 
CAAGGTATTGACACGCGCGTGACCAACTATGGTCAACGTACGGTCGGTCATATTTTACAGCATGGTGTGA 

45 TTTTGGAACGTGGCACCTTAACGTTTAACGGGATTGGTCATATTCTAAAAGACGCT 
TCAACAAGAAAGCCGTGTTTTGATCCTTTCTGACCAAGCAAGAGCCGATC 
GATGAAAATGAAGTAACAGCAGGTCATGCAGCTTCTATCGGTCAGGTTGACCC^ 
TGATGAGTCGAGGACTGGATCAAGAAACAGCAGAACGATTGGTT^ 

CGCTGAAATTCCTATTCCATCAGTCCGCCAAGAGATTATTAAGGTTTTAGATGAGAAATT 
TAA

Preferred GAS 042 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 29; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 29, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 042 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 29. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 29. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 29. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(16) GAS 058 

GAS 058 corresponds to M1 GenBank accession numbers GI:13621663 and GI:15674556, to M3
GenBank accession number GI:21909841, to M18 GenBank accession number GI:19745567, and is
also referred to as 'Spy^O' (M1), 'SpyM3_0305' (M3), and 'SpyM18_0477' (M18). Amino acid
and polynucleotide sequences of GAS 058 of an M1 strain are set forth below:

SEQ ID NO: 31 

MKWSGFMCTKSKRFLNLATLCIAL^ 

GYLEGYE KGLKGDD I PERPKI QVPEDVQPSDHGDYRDGYEEGFGEGQHKRDPLETEAEDDSQGGRQEGRQ 
20 GHQEGADSSDLNVEESDGLSVI DE WGVI YQAFST I WTYLSGLF 

SEQ ID NO: 32 

ATGAAATGGAGTGGTTTTATGAAAACAAAAT 
TACTAGGAACAACTTTGCTAATGGCA CATCCCGTA^ 
25 TCGCTTCGGGTTAGGCGATTTAGAAGATGATTCAGCTAA 

GGATATTTAGAGGGATATGAAAAAGGCTTAAAAGGAGATGA^ 
CTGACK^TGTTCAGCCATCTGACCATGGCGACTATAGAG 

ACATAAACGTGATCCATTAGAAACAGAAGCAGAAGATGATTCTCAAGGAGGACGTCAAGAAGGACGTCAA 
GGACATCAAGAAGGAGCAGATTCTAGTGATTTGAACGTTGAAGAAAGCGACGGTTTGTCTGTTATTGATG 
30 AAGTAGTTGGAGTAATTTATCAAGCATTTAGTACT^ 

Preferred GAS 058 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 31; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 31, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 058 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 31. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 31. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 31. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 31 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(17) GAS 290

GAS 290 corresponds to M1 GenBank accession numbers GI:13622978 and GI:15675757, to M3
GenBank accession number GI:21911221, to M18 GenBank accession number GI:19746869, and is
also referred to as 'Spy1959' (M1), 'SpyM3_1685' (M3), and 'SpyM18_2026' (M18). Amino acid
and polynucleotide sequences of GAS 290 of an M1 strain are set forth below:

SEQ ID NO: 33 

MKHILFIVGSLREGSFIWQIAAQAQKAI^ 
WIFTPVYNFSIPGSVKNLLDWLSRALDLSDPTC 
AGEFTKATVNPDAWGTGRLEI SKETKANLLSQAEALLAAI 


SEQ ID NO: 34 

ATGAAACATATTTTATTTATTGTTGGCTCGCTTCGTGAAGGGTCTTn 
CACAAAAAGCTCTGGAACATCAAGCAGTTGTATCTTACT 

AGATATCGAAGCTAATGCACCTTTACC^GTTGTTGACGCTCGTCAAGCTGTTCAGTCAGCGGATGCTATC 
1 5 TGGATTTTTACACCAGTTTACAACTTCTCTATTCCAGGTTCTGTTAAAAACCTGCTAGAC 
GTGCTCTTGATTTGTCTGATCCGACGGGCCCATCTC 

TGGAAATGGCGGGCATGATCAAGTATTTGATCAGTTTAAAGCACTATTGCCGTTTATCCGAAC 

GCAGGAGAGTTTACAAAAGCAACTGTGAATCCTGATGCCTGGGG^ 

AGACAAAAGCAAACTTGCTATCTCAGGCAGAGGCTCTTTTAGCGGCTATTTAG 

Preferred GAS 290 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 33; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 33, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 290 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 33. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 33. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 33.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(18) GAS 511 

GAS 511 corresponds to M1 GenBank accession numbers GI:13622798 and GI:15675592, to M3
GenBank accession number GI:21911053, to M18 GenBank accession number GI:19746700, and is
also referred to as 'Spy1743' (M1), 'SpyMSJSH' (M3), 'SpyM18_1815' (M18) and 'accA'. Amino
acid and polynucleotide sequences of GAS 511 of an M1 strain are set forth below:

SEQ ID NO: 35 

MTDVSRILKEARDQGRLTTIJDYANLIFDDFMELH 
NLARNFGQPNPEGYRKALRI^KQAEKFGRPVVT^ 
40 AI I IGEGGSGGALALAVADQVWMLENTM YAVLS PEGFAS I LWKDGSRATEAAELMKITAGELYKMGIVDR 
1 1 PEHGYFSSEI VDI I KANLIEQITSLQAKPLDQLLDERYQRFRKY 

SEQ ID NO: 36 

ATGACAGATGTATCAAGAATTTTAAAAGAAGCGCGTGATCAAGGG 
45 ACCTTATTTTCGATGACTTTATGGAACTGCATG 

TGGCCTAGCTTATTTGGCGGGACAACCTGTTACGGTCAT^ 

AATTTGGCAAGGAATTTTGGCCAGCCCAATCCAGAAGGTTATCGTAAAGCTTTGCG
CAGAAAAATTTGGACGACCAGTTOT

AGAACGAGGACAGGGTGAGGCCATTGCTAAAAATTTGATGGAAATG^ 
GCCATCATTATTXX»TGAAGGAGGCTCTG&TGGTGCATT^ 

TTGAAAATACTATGTATGCGGTTCTTAGCCCAGAAGGCTTTG 
5 GGCGACCGAGGCCGCTGAATTGATGAAAATCACAGCG 
ATTATTCCAGAACATGGTTATTTTTCAAGTGAAATC 

TAACCAGTTTGCAAGCTAAGCCATTAGACCAATTATTAGATGAGC^ 
A 

Preferred GAS 511 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 35; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 35, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 511 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 35. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 35. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 35.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(19) GAS 533 

GAS 533 corresponds to M1 GenBank accession numbers GI:13622912 and GI:15675696, to M3
GenBank accession number GI:21911157, to M18 GenBank accession number GI:19746804, and is
also referred to as 'Spy1877' (M1), 'SpyM3_1621' (M3), 'SpyM18_1942' (M18) and 'glnA'. GAS
533 has also been identified as a putative glutamine synthetase. Amino acid and polynucleotide
sequences of GAS 533 of an M1 strain are set forth below:

SEQ ID NO: 37

MAITVADIRREVKEKNWFLRLMFTDIM 

LYPDU)TWIVFPWGDENGAVAGLICDIYTAEGKPFAGDPRGNLKRALKH>INEIGYKSFNU3PEPEFFLFK 
30 MDDKGN PTLEVNDNGGYFDLAPI DLADNTRREI VN I LTKMGFE VEASHHEVAVGQHE I DF KYADVLKACD 
NIQIFKLWKTIAREHGLYATFMAKPKFGIAGSGMHOMSLFDN 
GLMKHAYNYTAITNPTVNSYKRLVPGYEAPVYVAWAGSNRSPL 

LA VLLEAGLDG 1 1 N K I EA PE P VEAN I YTMTMEE RNEAG 1 1 DLP S TLHN ALKALQKDD WQ KALGYHI YTN 
FLEAKRIEWSSYATFVSQWEIDHYIHNY 


SEQ ID NO: 38 

ATGGCAATAACAGTAGCTGACATTCGTCOTGAAGTCAAAGAAA 

TCACTGATATCATGGGCGTTATGAAAAATGTGGAGATTCCTGCAACTAAAGAACAGTTAGACAAAGTATT 
GTCTAACAAGGTTATGTTTGATGGTTCATCTATCGAAGGT 

40 CTTTACCCCGATTTAGACACTTGGATTGTTTTTCCCTGGGGAGATGAAAATGGAGCAGTTGCAGG 
TTTGTGATATTTATACAGCAGAAGGAAAGCCTTTTGCA 

GAAACACATGAACGAGATCGGCTACAAATCATTTAATCTTGGACCAGAACCAGAAT^ 

ATGGATGATAAAGGTAATCCGACACTTGAAGTTAACGATAATGGTGGTTATTTTGATTTAGCGCCAATTC 

ACTTAGCAGACAACACGCGCCGTGAAATTGTGAATATTTTAACGAAAATGG^ 

45 TCATCATGAAGTGGCTGTTGGTCAACATGAGATTGATTTTAAATATGCAGATGTTTTGAAAGCTTGTGAT 
AATATTCAAATTTTTAAGCTAGTTGTAAAAACGATTGCCC^ 

CTAAACCAAAATTTGGAATAGCTGGATCAGGGATGCACTGTA^ 

TAATGCTTTTTATGATGAAGCTGATAAGCGAGGGATGCAGTTAT 

GGACTAATGAAGCATGCTTATAACTACACTGCTATCACTAACCCTACAGTGAATTCTTATAAACGATTAG 
TTCCAGGTTATGAGGCACCTGTTTATCTCGCTTGGG
AGCATCACGTGGTATGGGAACGCGTTTGGAGTTACGTTCGGTTGATCCGACAGCT
TTGGCTGTTCTCTTGGAAGCTGGATTAGATGGT^ 

CTAACATTTATACCATGACAATGGAAGAACGAAATGAAGCAGGCATTATTGATTTC 
TAATGCCTTAAAAGCTCTTCAAAAAGATGATGTGGTACAAAAGGCACTAGGTT 
5 TTCTTAGAAGCAAAACGAATTGAATGGTCTTCCTATGCAACTTTTC 
ATATTCATAATTATTAG 

Preferred GAS 533 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 37; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 37, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 533 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 37. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 37. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 37. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(20) GAS 527 

GAS 527 corresponds to M1 GenBank accession numbers GI:13622332, GI:15675169, and
GI:24211764, to M3 GenBank accession number GI:21910381, to M18 GenBank accession number
GI:19746136, and is also referred to as 'Spy1204' (M1), 'SpyM3_0845' (M3), 'SpyM18_1155'
(M18) and 'guaA'. GAS 527 has also been identified as a putative GMP synthetase (glutamate
hydrolyzing) (glutamate amidotransferase). Amino acid and polynucleotide sequences of GAS 527 of
an M1 strain are set forth below:

SEQ ID NO: 39 

MTEISILNDVQKIIVLDYGSQYNQLIARRIREFGVFSELKSHKITAQELREINPIGIA/LSGGPNSVYADN 
AFGIDPEI FELGI PI LGICYGMQLITHKLGGKVVPAGQAGNREYGQSTLHLRETSKLFSGTPQEQLVLMS 
HGDAVTE I PEGFHLVGDSNDC PYAAI ENTEKNLYGIQFHPEVRHS VYGND I LKNFAI S I CGARGDWSMDN 
30 FIDME I AKI RETVGDRKVLLX3LSGGVDS S WGVLLQKAIGDQLTCI FVDHGLLRKDEGDQVMGMLGGKFG 
LNI IRVDASKRFLDLLADVEDPEKKRKI IGNEFVYVFD^ 
KSHHNVGGLPEDMQFELIEPLNTLFKDEVRAIX^^ 

ESDAI LREEI AKAGLDRDWQYFTVNTGVRSVGVMGDGRTYDYT I AI RAI TS I DGMTADFAQLPWDVLKK 
ISTRI VNB VDHVNR I VYD I TS KP PATVE WE 


SEQ ID NO: 40 

ATGACTGAAATTTCAATTTTGAATGATGTTCAAAAAATTATC 

AGCTTATTGCTAGACGTATTCGAGAGTTTGGTGTTTTCTCCGAACTAAAAAGCCATAAAATCACCGCTCA 
AGAACTTCGTGAGATCAATCCCATAGGTATCGTTTT^ 

40 GCCTTTGGCATTGACCCTGAAATCTTTGAACTAGGGATTCCGATTCTTGGTATCTGTTAC 
TAATCACCCATAAATTAGGTGGTAAAGTTGTTCCK^TG 

AACCCTTCATCTTCGTGAAACGTCAAAATTATTTTCAGGCACACCTCAAGAACAACT 
CATGGTGATGCTGTTACTGAAATTCCAGAAGGTTTCCACCTTGTTGGAGACTCAAAT 
CAGCTATTGAAAATACTGAGAAAAACCTTTACGGTATTCAGTTCCACC 
45 TGGAAATGACATTCTTAAAAACTTTGCTATATCAATT 

TTTATTGACATGGAAATTGCTAAAATTCGTGAAACTGTAGGCGATCGTAAAGT^ 
GTGGAGTTGATTCTTCAGTTCTTGGTGTTCTACTTCAAAAAGC 
CGTTGATCACGGTCTTCTTCGTAAAGACGAGGGCGATCAAGTTATGGGAAT 
CTAAATATTATCCGTGTGGATGCTTC^AAACGTTTCTTAGACCTTCrTGCAGACG 
AAAAACGTAAAATTATTGCTAATC

TGACTTCCTTGCCCAAGGAACACnTTATACTGATATCATTGAGTCAQ 
AAATCACATCACAATGTGGGTGGTCTCCCCGAAGACATGCAGTTTGAA 
TTTTCAAAGATGAAGTTCGAGCGCTTGGAATCGCTCTTGGAATW 
5 ATTTCCAGGTCCTGGACTTGCTATCCGT^ 

GAATCAGACGCTATCCTTCGTGAAGAAATTGCTAAGGCTGGACTTGATCG 

CAGTTAACACAGGTGTCCGTTCTGTAGGCGTCATGGGAGATGGTCGTACTTATGATTATACCATCG 
TCGTGCTATTACGTCTATTGATGGTATGACJ^GCTGACTTTK 

ATCTCAACACGTATCGTAAATGAAGTTGACCACGTTAACCGTATCGTCTACGACATCACAA 
10 CCGCAACAGTTGAATGGGAATAA 
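
Each antigen in this section is given as an amino acid sequence paired with a polynucleotide sequence. As a rough consistency check, the sketch below translates the opening and closing codons of SEQ ID NO: 40 as printed above, using the standard genetic code. The names `CODON_TABLE`, `translate` and `alt_start` are hypothetical; the `alt_start` option (reading an initial GTG or TTG as methionine, as commonly seen in bacterial genes) is an assumption and is not stated in the text.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, alt_start: bool = False) -> str:
    """Translate a coding sequence 5'->3' in frame 1, stopping at a stop codon.

    Unrecognised codons (e.g. containing OCR debris) become 'X'.
    With alt_start=True an initial GTG or TTG codon is read as 'M'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")
        if i == 0 and alt_start and codon in ("GTG", "TTG"):
            aa = "M"
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 42 nucleotides of SEQ ID NO: 40 as printed.
    print(translate("ATGACTGAAATTTCAATTTTGAATGATGTTCAAAAAATTATC"))  # MTEISILNDVQKII
    # Last 15 nucleotides of SEQ ID NO: 40 as printed, including the stop codon.
    print(translate("GTTGAATGGGAATAA"))  # VEWE
```

Both outputs agree with the start and end of SEQ ID NO: 39 as printed.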

Preferred GAS 527 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 39; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 39, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 527 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 39. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 39. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 39. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
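
Criterion (b) above, "a fragment of at least n consecutive amino acids", amounts to a contiguous substring of the reference sequence. The following is a minimal sketch under that reading; `fragments` and `is_fragment` are hypothetical helper names, and the reference string is only the first 20 residues of SEQ ID NO: 39 as printed above, used for illustration.

```python
def fragments(reference: str, n: int) -> list[str]:
    """All contiguous fragments of exactly n residues, in order of position."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [reference[i:i + n] for i in range(len(reference) - n + 1)]


def is_fragment(candidate: str, reference: str, min_length: int = 7) -> bool:
    """True if candidate is a run of at least min_length consecutive
    residues of reference (n is 7 or more in the text)."""
    return len(candidate) >= min_length and candidate in reference


if __name__ == "__main__":
    # First 20 residues of SEQ ID NO: 39 as printed (illustration only).
    reference = "MTEISILNDVQKIIVLDYGS"
    sevenmers = fragments(reference, 7)
    print(len(sevenmers))                     # 14
    print(sevenmers[0])                       # MTEISIL
    print(is_fragment("ILNDVQK", reference))  # True
    print(is_fragment("MTEIS", reference))    # False (shorter than 7)
```

Whether a given fragment also comprises an epitope, as the preferred fragments do, cannot be decided from the sequence by this check alone.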

(21) GAS 294 

GAS 294 corresponds to M1 GenBank accession numbers GI:13622306, GI:15675145, and
GI:26006773, to M3 GenBank accession number GI: 21910357, to M18 GenBank accession number
GI: 19746111 and is also referred to as 'Spy1173' (M1), 'SpyM3_0821' (M3), 'SpyM18_1125'
(M18) and 'gid'. GAS 294 has also been identified as a putative glucose-inhibited division protein.
Amino acid and polynucleotide sequences of GAS 294 of an M1 strain are set forth below:

SEQ ID NO: 41 

30 MSQSTATY I NVI GAGLAGSEAAYQ I AKRG I PVKLYEMRGVKATPQHKTTN FAELVCSNS FRGDSLTNAVG 
LLKEEMRRLDSI IMRNGEANRVPAGGAMAVDREGYAES VTAELENHPLIEVIRGEITEI PDDAITVIATG 
PLTSDALAEKI HALNGGDGFYFYDAAAPI IDKSTIDMSKVYLKSRYDKGEAAYIiKCPMTKEEFMAFHEAL 
TTAEEAPLNAFEKEKYFEGCMPI EVMAKRGI KTMLYGPMKPVGLEYPDDYTGPRDGEFKTPYAWQLRQD 
NAAGSL YN I VGFQTHLKWGEQKRVFQMI PGLENAEFVRYGVMHRNS YMDS PNIiLTETFQSRSN PNLFFAG 

35 QMTGVBGYVESAASGLVAGINAARLFKREEALIFPQTTAIGSLPHYVTHADSKHFQPMNVNFGI I KELEG 
PRIRDKKERYEAIASRALADLDTCLASL 

SEQ ID NO: 42 

TTGTCTCAATCAACTGCAACTTATATTAATGTTATTGGAGCT 

40 AGATTGCTAAGCGCGGTATCCCCGTTAAATTGTATGAAATGCGTGGTGTCAAAGCAACACCGCAACATAA 
AACCACTAATTTTGCCGAATTGGTCTGTTCCAACT 

CTTCTCAAAGAAGAAATGCGGCGATTAGACTCCATTATTATGCGTAATGGTGAAGCTAACCGCGT 
CTGGGGGAGCAATGGCTGTTGACCGTGAGGGGTATGCAGAGAGTGT 

TCTCATTGAGGTCATTCGTGGTGAAATTACAGAAATCCCTGACGATGCTATCACGGTTATCGCGA 
45 CCGCTGACTTCGGATGCCCTGGCAGAAAAAATTCACGCGC^ 

ATG^GCAGCGCCTATCATTGATAAATCTACCATTGATATGAGCAAGGT^ 

TAAAGGCGAAGCTGCTTACCTCAACTGCCCTATGACCAAAGAAGAAT^ 

ACAACCGCAGAAGAAGCCCCGCTGAATGCCTTTGAAAAAGAAAAGTATTTTGAAGGCTGTA 

AAGTTATGGCTAAACGTGGCATTAAAACCATGCTTTATGGACCTATG^ 
50 AGATGACTATACAGGTCCTCGCGATGGAGAATTTAAAACGCCATATGCCGTCGTGCAATTGCGTCAAGAT 
AATGCAGCTGGAAGCCTTTATAATATCGTTGGTTTCCAAACCXIATCT
TTTTCCAAATGATTCCAGGGCTTGAAAATGCTGAGTTTC 
TATGGATTCACCAAATCTTTTAACCX1AAACCTTCCAAT 
CAGATGACTGGAGTTGAAGGTTATGTCGAATCAGCTGCTTCAC^ 
5 GTTTGTTCAAAAGAGAAGAAGCACTTATTTTTCCTCAGACAACZAGCT 
GACTCATGCCGACAGTAAGCATTTCCAACCAATGAACGTC^ 

CCACGCATTCGTGACAAAAAAGAACGTTATGAAGCTATTGCTACT 
GCTTAGCGTCGCTTTAA 


Preferred GAS 294 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 41; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 41, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 294 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 41. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 41. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 41. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
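
The terminal deletions described above (fragments lacking one or more amino acids from the C-terminus and/or the N-terminus) can be sketched as a simple slicing operation. `truncate` is a hypothetical helper, and the example string is only the first 20 residues of SEQ ID NO: 41 as printed above, with OCR spacing removed.

```python
def truncate(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return sequence lacking n_term residues from the N-terminus
    and c_term residues from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("deletion sizes must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("deletions would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


if __name__ == "__main__":
    # First 20 residues of SEQ ID NO: 41 as printed (illustration only).
    start_of_seq41 = "MSQSTATYINVIGAGLAGSE"
    print(truncate(start_of_seq41, n_term=1, c_term=2))  # SQSTATYINVIGAGLAG
```

The same slicing would apply to removing a whole N-terminal region such as a signal peptide, provided its length is known; the text does not give those lengths, so none is assumed here.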

(22) GAS 253 

GAS 253 corresponds to M1 GenBank accession numbers GI:13622611, GI:15675423, and
GI:21362716, to M3 GenBank accession number GI: 21910711, to M18 GenBank accession number
GI: 19746473 and is also referred to as 'Spy1524' (M1), 'SpyM3_1175' (M3), 'SpyM18_1541'
(M18) and 'murG'. GAS 253 has also been identified as a putative undecaprenyl-PP-MurNAc-pentapeptide-UDPGlcNAc
GlcNAc transferase. Amino acid and polynucleotide sequences of GAS
253 of an M1 strain are set forth below:

SEQ ID NO: 43 

MPKKI LFTGGGTVGHVTLNLI L I PKFI KDGWEVHYI GDXNGI BHTE I EKSGLD VTFHAIATGKLRR YFSW 
30 QNLAD VFKVALGLLQSL F I VAKLR PQALF S KGGFVS VPP VVAAKLLGKPVF I HE SDR SMGIANK I AY KF A 
TTMYTTFEQEDQLSKVKHLGAVTKVFKDANQMPESTQIiEAVKEYFSRDLKTLL 

H PELKQRYN I IN I TGD PHLNEL S SHLYR V13 YVTDL YQ PLMAMADLWTRGGSNTLPEIJjAMAKLHLI VPL 

GKEASRGDQLENATYFEKRGYAKQLQEPDLTLHNFDQAMA 

ADISSAIKEK 


SEQ ID NO: 44 

ATGCCTAAGAAGATTTTATTTACAGGTGGTGGAACTGTAGGTCATGTCACCT 
CAAAATTTATCAAGGACGGTTGGGAAGTACATTATATTGCT 
TGAAAAGTCAGGCCTTGACGTGACCTTTCATGCT 
40 CAAAATCTAGCTGATGTTTTTAAGGTTGCACTTGG 
GCCCTCAAGCCCTTTTTTCCAAAGGTGGTTTTGT 

TAAACCAGTCTTTATTCATGAATCAGATCGGTCAATGGGACT^ 

ACTACCATGTATACCACTTTTGAGCAGGAAGACCAGTTGTCTAAAGTTAAACACCTTGGAGCGGTGACAA 
AGGTTTTCAAAGATGCCAACCAAATGCCTGAATCAACTCAGTT 
45 AGACCTAAAAACCCTCTTGTTTATTGGTGGTTCGGCAG^ 

CATC CAGAATTGAAGCAACGTTATAATATCATCAATATTACAGGAGACCCTCAC CTTAATGAATTGAGTT 

CTCATCTGTATCteAGTAGATTATGTTACCGATCTCTACCAACCTTT 

GACAAGAGGGGGCTCTAATACACITTTTGAGCTACTGGCAATGGCT 

GGTAAAGAAGCTAGCCGTGGCGATCAGTTAGAAAATGCCACTTATTTTGAGAAGAGGGGCTACGCT 
AATTACAGGAACCTGATTTAACTTTGCATAATTITCATCAGGCAATGGC^

TGATTATGAGGCTACTATGTTGGCAACTAAGGAGATT^ 

GCTGATATTAGCTCCGCGATTAAGGAGAAGTAA 

Preferred GAS 253 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 43; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 43, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 253 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 43. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 43. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 43. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(23) GAS 529 

GAS 529 corresponds to M1 GenBank accession numbers GI:13622403, GI:15675233, and
GI:21759132, to M3 GenBank accession number GI: 21910446, to M18 GenBank accession number
GI: 19746203 and is also referred to as 'Spy1280' (M1), 'SpyM3_0910' (M3), 'SpyM18_1228'
(M18) and 'glmS'. GAS 529 has also been identified as a putative L-glutamine-D-fructose-6-phosphate
aminotransferase (Glucosamine-6-phosphate synthase). Amino acid and polynucleotide
sequences of GAS 529 of an M1 strain are set forth below:

SEQ ID NO: 45 

MCGI VGWGNRNATDI LMQGLBKLEYRGYDSAGI FVANANQTNLIKSVGRI ADLRAKIGIDVAGSTGIGH 
25 TRWATHGQSTEDNAHPHTSQTGRFVLVHNGVIENYLHI KTEFLAGHDFKGQTDTE I AVHLIGKFVEEDKL 

SVLEAFKKSLSIIEGSYAFALMDSQATDTIWAKNKSPLLIGLGEGYNMVCSDAMAMIRETSEFT4EIHDK 
ELVILTKDKVTVTOYDGKELIRDSYTAELDLSDIGKGT 
DPAIITSIQEADRLYItAAGTSYHAGFATKNMLEQLTO^ 
TADSRQVLVKANAMGIPSLTVTNVFGSTLSREATYTMLIM 
30 NGKQEALDFNLVHELSLVAQS I EATLS EKDLVAEKVQALLATTRNAF YIGRG^?DYYVAMEAALKLKE I S Y 
IQCEGFAAGELKHGTI SLI EEDTPVI ALI SSSQLVASHTRGNI QEVAARGAHVLTWEEGLDREGDDI I V 
NKVHPFLAPI AMVI PTQLI AYYASLQRGLDVDKPRNLAKAVTVE 

SEQ ID NO: 46 

35 ATGTGTGGAATTGTTGGAGTTGTTGGAAATCGCAATGCAACG 
TTGAATACCGGGGTTATGATTCAGCAGGAATTTTTGTG 

AGTGGGGCGGATTGCTGATTTGCGTGCCAAGATTGGCATTGATGTTGCTGGTTCAA 

ACCCGTTGGGCAACGCATGGCCAATCAACAGAGGATAATGCCCATCCTCACACGTCACAAACTGGACG^ 
TTGTACTTGTTCATAATGGTGTGATTGAAAATTACCTT 

40 TTTTAAGGGGCAGACAGATACTGAGATTGCAGTACACTTGATTGGA 

TCAGTACTGGAAGCTTTTAAAAAATCTTTAAGCATTATTGA^ 

GCCAAGCAACI^TACTATTTATGTGGCTAAAAACAAGTCTCCATTC 

CAACATXX3TTTGTTCAGATGCCATGGCCATGATTCGTGAM 

GAGCTAGTTATTTTAACCAAAGATAAGGTAACTGTTACAGACT 
45 CCTACACTGCTGAATTAGACTTATCTGATATTGGCAAAGGGACTTATCCTTTC 

TGATGAGCAACCAACCGTAATGCGTCAATTAATTTCAACTTATGCAGATGAAA 

GATCCGGCTATCATTACCTCTATCCAAGAGGCTGACCGTCTTTATATTTT 

ATGCTGGTTTTGCAACAAAAAATATGCTTGAGCAATTGACAGAT^ 

TGAGTGGGGTTACCACATGCCTCTGCTTAGCAAGAAACCAATGTTTATTCTACTAAGCCAATC7V 
ACCGCAGATAGTCGTCAAGTTTTAGTAAAGGCAAATGCTATGGGCATTCCGAGTT^

TTCCAGGATCAACCTTATCACGTGAAGCAACATACACCATGTTGATTCA 
TGCGTCTACAAAAGCTTACACTGCACAAAT^^ 

AATGGTAAGCAAGAAGCTCTTGACTTTAACTTGGTAC^ 

5 CGACTTTGTCTGAAAAAGATCTCGTGGCAGAAAAGGTTCAAGCTT^ 

TTACATCGGGCGTGGCAATH^TTATTACGTTGCGATGGAAGC^ 

ATTCAATGCGAAGGCTTTGCGGCTGGTGAATTGAAACATGGAACCAT^ 
CAGTAATCGCTTTAATATCGTCTAGTCAGTTGGTTGC^ 

TGCCCGTGGGGCTCATGTTTTAAGAGTTGTC^ • 
1 0 AATAAGGTTCATCCTTTCCTAGCCrc^ 

CATTACMCGTGGACTTGATGTTG&TAAGCK^ 

Preferred GAS 529 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 45; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 45, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 529 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 45. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 45. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 45. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(24) GAS 045 

GAS 117 corresponds to M3 GenBank accession number GI: 21909751, M18 GenBank accession
number GI: 19745421 and is referred to as 'SpyM3_0215' (M3), 'SpyM18_oppA' (M18) and 'oppA'.
GAS 045 has been identified as an oligopeptide permease. Amino acid and polynucleotide sequences
of GAS 045 from an M1 strain are set forth below:

SEQ ID NO: 47

30 WFMKKSKWLAAVSVAI LS VSAIAA CGNKWASGG5EATKTYKYVF\mnP v.Q T ,nvT t .Tyjnq. 
GTTDVITQMVTJGLLENDEYGNLVPSLAKDWKVSKDGLTY 

TAEDFVTGLKHAVDDKSDALYWEDSI KNLKAYQNGEVDFKEVGVKALDDKXVQYTLNKP 

ESYWNSKTTYSVLFPVNAKFLKSKGKDFGTTDPSSILVNGAYFLSAPTSKSSMEFHKNEN 
YWDAKNVG I E SVKLTYSDGSDPGSF YKN FDKGE FSVARLY PNDPTYKS AKKNYADN I TYG 

35 MLTGDI ^LTWNIJ^TS FKNTKKDPAQQDAGKKALNNKDFRQAIQF AFDRASFQAQTAGQ 
DAKTKALRNMLVPPTFVTIGESDFGSEV^KEMAKLGDEWKDWLADAQDGFYNPEKA^ 
FAKAKEALTAEG\nTPVQLDYPVT>QANAATVQEAQS 

THEAQGF YAETPEQQDYDI I SSWWGPDYQDPRTYLDIMSPVGGGSVIQKLGIKAGQNKDV 
VAAAGLDTYQTLLDEAAAITDDNDARYKAYAKAQAYLTDNAVD I P WALGGTPR VTKAVP 
40 FSGGFSWAGSKGPLAYKGMKLQDKPVTVKQYEKAKEKWMK^ 

SEQ ID NO: 48 

GTGACTTTTATGAAGAAAAGTAAATGGTTGGCAGCTGTAA 

TCCGCTTTGGCAGCTTGTGGTMTAAAAATGCTTr AP^TnnrTP A n*zwv*r**mnn 
45 TACAAGTACGTTTTTGTTAACGATCCAAAATCA 

GGAACGACT^TGTGATAACACAAATGGTTGATGGTC 

AATTTAGTAC»TCACTTGCTAAAGATTGGAAGGTTTCAAAAGACGGTCT^ 
TATACTCTTCX3CGATGGTGTCTCTTGGTATACGGCTGATC 
ACAGCAGAAGATTTTGTGACTGGTTTGAAGCACGCGGTTGACGA 
50 TACGTTGTTGAAGATTCAATAAAAAACTTAAAGGCCT 

AAAGAAGTTGGTGTCAAAGCCCTTGACGATAAAACTGTTCAGTATACTTTGAACAAGCCT 
GAAAGCTACTGGAATTCAAAAACAACTTATAGTG

TTGAAGTCAAAAGGTAAAGATTTTGCTACAACCGATCCATCATCAATCCT^ 
GCTTACTTCTTGAGCGCCTTCACCTCAAAATCATCTATGGAATTC 
TACTGGGATGCTAAGAATGTTGGGATAGAATCTGTTAAATT6ACTTACTCAGA 
5 GACCCAGGTTCGTTCTACAAGAACTTTGACAAGGGTGAGTTC^ 

CCAAATGACCCTACCTACAAATCAGCTAAGAAAAACTATGCTGATAACATTACTTACGGA 
ATGTTGACTGGAGATATCCGTCATTTAACATGGAATTTGAAC 

ACT AAGAAAGAC CCTGCACAAC AAGATGC CGGTAAGAAAGCTCTTAACAACAAGGATTTT 
CGTCAAGCTATTCAGTTTGCTTTTGACCGAGCGTC^ 
1 0 GATGCCAAAACAAAAGCCTTACGTAACATGCTTGTCCCACCAACATTTGTC 

GAAAGTGATTTTGG TTCAGAAGTTGAAAAGGAAATGGCAAAACTTGGTGATGAATGGAAA 
GACGTTAACTTAGCTGATGCTCAAGATGGTTTCTATAATC 

tttgcaaaagccaaagaacxhttaacagctgaa^^ 
taccctgttgaccaagcaaacgcagcaactgttcaggaagcccagtctttc^ 

] 5 GTTGAAGC^TCTCTTGGTAAAGAGAATGTCATTGTCAATGTTCTTGAAACAGAAACAT^ 
ACTCACGAAGCCCAAGGCTTCTATGCTGAGACCCCAGAACAACAAGACTACGATATCATT 
TCATCATGGTGGGGACCAGACTATCAAGATCCACGGACCTACCTTCACATCATGAGTCCA 
GTAGGTGGTGGATCTGTTATCCAAAAACTTGGAATCAAAGCAGGTCAAAATAAGGATGTT 
GTGGCAGCTGCAGGCCTTGATACCTACCAAACTCTTCTTGATGAAGCA 

20 GACGACAACGATGCGCGCTATAAAGCTTACGCAAAAGCACAAGCCTACCTTACAGATAAT 
GCCGTAGATATTCCAGTTGTGGCATTGGGTGGCACTCCACGAGTTACTAAAGCCGTTCCA 
TTTAGCGGGGGCTTCTCTTGGGCAGGGTCTAAAGGTCCTCTAGCATATAAAGGAATGAAA 
CTT CAAGACAAAC CTGTCACAGTAAAACAATACX1AAAAAGCAAAAGAAAAATGGATGAAA 
GCAAAGGCTAAGTCAAATGCAAAATATGCTGAGAAGTTAGCTGATCACGTTC 


Preferred GAS 045 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 47; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 47, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 045 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 47. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 47. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 47. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 47 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).


(25) GAS 095 

GAS 095 corresponds to M1 GenBank accession numbers GI: 13622787 and GI:15675582, to M3
GenBank accession number GI: 21911042, to M18 GenBank accession number GI: 19746634 and is
also referred to as 'Spy1733' (M1), 'SpyM3_1506' (M3), 'SpyM18_1741' (M18). GAS 095 has also
been identified as a putative transcription regulator. Amino acid and polynucleotide sequences of
GAS 095 of an M1 strain are set forth below:

SEQ ID NO: 49

M KIGKKIVLMFTAIVLTTVLAIXjVYL^ 
SSERASKWEGNSDSMILVTVNPKTKKTTMTSL^ 

VQDLLNITIDNWQI^QGLIDLWAVGGITV^BFDFPISIAENEPEYQATVAPGTHKINGBQALW^ 
MRYDDPEGDYGRQKRQREVIQKVXKKIIJUiDS 
I KTYQLKGEDATLSDGGS YQI VTSKHLLE I QNRI RTEIX3LHKVNQLKTNATVYENLYGSTKSQTVNNNYD

SSGQAPSYSDSHSSYANYSSGVl>TGQSASTDQDSTASSra 

NPQT 

SEQ ID NO: 50

ATGAAAATTGGAAAAAAAATAGTTTTAATGTTCACAGCTATTGTC 
TCTATCTAACTAGTGCTTATACCTTCTCAA CAGGAGAATTATCAAAGACCTTT 
TTCAAACAAAAGTGATGCCATTAAACAAACAAGAGCTTTTTCTATOT 
TCTTCAGAGCGTGCCTCCAAGTGGGAAGGAAACAGTGATTCGATGATTT^ 
1 0 CCAAGAAAACAACTATGACTAGTTTAGAACGAGATACCTTAACCACX3TO 
AATGAATGGTGTTGAAGCTAAGCTTAACGCTGCTTATGCAGCAGGTG 

GTGCAAGATC TTTTGAATATC ACCATTGATAAC T ATGTTOVAATTAATATGCAAGGCCTTATTGATCTTG 
TGAATGCAGTTGGAGGGATTACAGTTACAAATGAGTTTG 

TGAATATCAAGCTACTGTTGCGCCTGGAACACACAAAATTAACGGTGAACAAGCTTT^ 
1 5 ATGCGTTATGATGATCCTGAGGGAGATTATGGTCGACAAAAGCGTCAACGTGAAGT^ 
TGAAAAAAATCCTTGCTCTTGATAGCATTAGCTCTTATCXX^ 

GCAAACGAATATCGAAATCTCTTCTCGCACTATCCCTAGTCTATTAGGTTATCGTGACGCACTTA 
ATTAAGACTTATCAACTAAAAGGAGAAGATGCCACITTATCAGATGGTGGATC^ 

CTAATCATTTGTTAGAAATCCAAAATCGTATCCGAACAGAATTAGGACTTCATAAGGTTAATCAATTA^ 
20 AACAAATGCTACTGTTTATGAAAATTTGTATGGGTCAA 

TCTTCAGGCCAGGCTCC^TCTTATTCTGATAGT^TAGCTCTTACGCTAATTATTCAAGTGGAGTAGATA 

CCGGCCAGAGTGCTAGTACAGACCAGGACTCTACTGCTTCAAGCCATAGGCCAGCTAOTCCGTCTTCTTC 

ATCAGATGCTTTAGCAGCTGATGAGTCTAGCTCATCAGGGTCTGGA 

AACCCTCAGACCTAA 


Preferred GAS 095 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 49; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 49, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 095 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 49. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 49. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 49. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 49 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(26) GAS 193 

GAS 193 corresponds to M1 GenBank accession numbers GI: 13623029 and GI: 15675802, to M3
GenBank accession number GI: 21911267, to M18 GenBank accession number GI: 19746914 and is
also referred to as 'Spy2025' (M1), 'SpyM3_1731' (M3), 'SpyM18_2082' (M18) and 'isp'. GAS 193
has also been identified as an immunogenic secreted protein precursor. Amino acid and
polynucleotide sequences of GAS 193 of an M1 strain are set forth below:

SEQ ID NO: 51

mkkrkliavtllstillnsavplwaotslrnstsst^ 

kdhkpshthptppsndtkqtdoasseatdkpnkdkndtkqpdssdqstpspkdqssqkesqnkdgrptps 

pdqqkdqtpdktpeksadktpekgpekatdktpep^dapkpiqpplaaapvfipwresdkdlsklkpss 
rssmyvrhvmsdsaythnllsrrygitaeq 


AMAESSLGTQWAKEKGANMFGYGAFDFNPNNAKCT^ 

GQLDTLI DGGVYFTDTSGSGQRRADIMTKLDQWI DDHGSTPEI PBHLKI TSGTQFSEVPVG YKRSQPQNV 
LTYKSETYSFGQCTWYAYNRVKEIiGYQVDRYMGN^ 

HVAWEQI KEDGS I LI SESNVMGLGTI S YRTFTAEQASLLTYWGDKLPRP 

SEQ ID NO: 52 

ATGAAGAAAAGGAAATTGTTAGCAGTAACACTATTAAGTAC 
TTGTTGCTGATACCTCCTTGCGTAATAGCACATCATCCACTGATCAGCCTAC 

GGATGACGAGAGTGAAACACCAAAAAAAGACAAAAAAAGCAAGGAAACAGCGTCGCAGCACGACAC^ 
1 0 AAAGACCATAAGCCATCACACACTCACCCAACCCCCCCTTC 

CATCTGAAGCTACTGACAAACCAAATAAAGACAAAAACGACACCAAGCAACCAGA 

CACCCCATCTCCCAAAGACCAGTCGTCTC7UIAAAGAGTCACAAAACAAAGACGGCCGACCT 
CCTGATCAGCAAAAAGATCAGACACCTGATAAAACACCAGAAAAATCAGCTGA^ 

GACCAGAAAAAGCAACTGATAAAACACCAGAGCCAAATCGTGACX3CTCCAAAACCCATCCA^ 
1 5 AGCAGCTGCTCCTGTCTTTATACCTTGGAGAGAAAGTGA 

CGCTCATCAGCGGCTTACGTGAGACACTGGACAGGTGACTCTGC 

GTTATGGGATTACTGCTGAACAGCTAGATGGTTTTTTGAACAGTCT 

CTTAAACGGAAAGCGTTTATTAG^TGGGAAAAACTAACAGGACTAGACGTTCGAGCTATCG 
GCAATGGCAGAAAGCTCACTAGGTACTCAGGGAGTTGCTAAAGAAAAAGGAGCCAATATGTT^ 
20 GCGCCTTTGACTTCAACCCAAACAATGCCAAAAAATACAGCGATGAGGTTGCTATTCGT 
AGACACCATCATTGCCAACAAAAACO^CCTTTGAAAGACAAGACCTCAAAGCAA 
GGCCACTTGGATACCTTGATTGATGGTGGGGTTTACT^ 

CAGATATCATGACCAAACTAGACCAATGGATAGATGATCATGGAAGCACACCTGAGATTCCAGAACATCT 

CAAGATAACTTCCGGGACACAATTTAGCGAAGTGCCCGTAGGTTATAAAAGAAGTCAGCC^CAAAACGTT 
25 TTGACCTACAAGTCAGAGACCTACAGCTTT^ 

TAGGTTATCAAGTCGACAGGTAC^TGGGTAACGGTGGCGACTGGCAGCGCAAGCCAGGTTTTGTGACCAC 

CCATAAACCTAAAGTGGGCTATGTCGTCTCATTTGCACCAGGCCAAGCAGGAGCAGATC 
CACGTTGCTGTTGTAGAGCAAATCAAAGAAGATGGTTCTATC^ 

TAGGCACCATTTCCTATCGGACGTTCACAGCTGAGCAGGCTAGTTTGTT 
30 ACTCCCAAGACCATAA 

Preferred GAS 193 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 51; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 51, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 193 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 51. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 51. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 51. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(27) GAS 137 

GAS 137 corresponds to M1 GenBank accession numbers GI:13621842, GI: 15674720 and
GI:30173478, to M3 GenBank accession number GI:21909998, to M18 GenBank accession number
GI: 19745749 and is also referred to as 'Spy0652' (M1), 'SpyM3_0462', and 'SpyM18_0713' (M18).
Amino acid and polynucleotide sequences of GAS 137 of an M1 strain are set forth below:

SEQ ID NO: 53 

MSDKHI NLVI VTGMSGAGKTVAI QS FBDLGYFTI D1W PPALV PKFLELI EQTNENRRVALVVDMRSRLFF 
50 KEINSTLDSIESNPSIDFRILFLDATDGELVSRYKBTRRSHPLAAIXjRVlJXalRLERELLSPLKSMSQW 
VDTTKLTPRQLRKTI SDQFSEGSKQASFRI EVMS FGFKYGLPLDADLVFDVRFL PNPYYQVELREKTGLD 
EDVFNYVMSHPESEVFYKHUiNLIVPILPAY
BSHRDQNRRKETVNRS 

SEQ ID NO: 54

5 ATGTC^GACAAACACATTAATTTAGTTATTC 

AGTCTTTTGAGGATCTAGGCTACTTTACCATT^ 

ATTAATTGAACAAACCAATGAAAATCGTAGGGTCGCTTTGGTTC 

AAGGAAATTAATTCTACCTTAGATAGTATTGAAAGCAATCCTAGCATTGATTT^ 

ATGCAACGGATGGAGAATTGGTGTCACGCTATAAAGAAACCAGACGGAGCCACCCTCT 
10 TCGTGTCCTTGATGGTATTCGATTGGAAAGAGAA 

GTGGATACAACAAAATTGACCCCTAGACAATTGCGTAAAAC 

ATCAAGCCTCTTTCCGTATTGAAGTGATGAGCTTTGGGTTCA^ 

GGTTTTTGATGTGCGTTTTCTACCCAATCCTTATTATC^ 

GAGGACGTTTTTAATTATGTGATGTCTCACCCAGAATCAGACX3TC 
1 5 TTGTCCCTATCTTACCGGCTTACCAAAAAGAAGGGAAGTCTGTCTTGACGGTGGCT 

AGGCCAACACCGCAGCGTTGCCTTTGCCCATTGCTTGGCAGAAAGTC 

GAAAGCCATCGTGATCAAAATCGTCGTAAGGAAACGGTGAATCGTTCATGA 

Preferred GAS 137 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 53; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 53, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 137 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 53. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 53. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 53. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(28) GAS 084

GAS 084 corresponds to M1 GenBank accession numbers GI:13622398 and GI:15675229, to M3
GenBank accession number GI: 21910442, to M18 GenBank accession number GI: 19746199 and is
also referred to as 'Spy1274' (M1), 'SpyM3_0906' and 'SpyM18_1223' (M18). GAS 084 has also
been identified as a putative amino acid ABC transporter/periplasmic amino acid binding protein.
Amino acid and polynucleotide sequences of GAS 084 of an M1 strain are set forth below:

SEQ ID NO: 55

M I I KKRTVAI LAIASSFFLVA CQATKSLKSGDAWGVYQKQKSITVGFDNTFVPMGYKDESGRCKGPDIDL 
AKEVFHQYGLKVNFQAINWDMKEAELNNGKI DVI WNGYS I TKERQDKVAFTDS YMRNEQI I WKKRSDI K 
TISDMKHKVLGAQSASSGYDSLLRTPKIAKDFIKNKDANQYETFTQAFIDLKSDRIDGILI 
40 AKEGQLENYRMI PTTFENEAFSVGLRKEDKTLQAKINRAFRVLYQNGKFQAI SEKWFGDDVATANI KS 

SEQ ID NO: 56

ATGATTATAAAAAAAAGAACCGTAGCAATTTTAGCCATAGCTAGTAGCTTTTTCTTGGTAGCTTGTCAAG 
CTACTAAAAGTCTTAAATCAGGAG 

45 TGACAATACGTTTGTTCCTATGGGCTATAAGGATGAAAGC 

GCTAAAGAAGTTTTTCACCAATATGGACTCAAGGTTAACTTTCAAGCTATTAA 

CAGAACTAAACAATGGTAAAATTGATGTAATCTGGAATGGTTATTCAATAACTAAG 
GGTTGCCTTTACTGATTCTTACATGAGAAATGAACAAATTATTGTTGTCA 

ACAATATCAGATATGAAACATAAAGTGTTAGGAGCACAATCAGCTT 

50 GAACTCCTAAACTGCTGAAAGATTTTATTAAAAATAAAGACGCT 
TTTTATTGATTTAAAATCAC^TCGTAT

GCAAAAGAAGGGCAATTAGAGAATTATCGGATGATCCCAACGACC^^ 

GACTTAGAAAAGAAGACAAAACGTTGGAAGCAAAAATTAATCGTGC^ 

CAAATTTCAAGCTATTTCTGAGAAATGCTTTGGAGATGA 


Preferred GAS 084 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 55; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 55, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 084 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 55. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 55. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 55. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 55 is removed. Other fragments omit one or more domains of the protein (e.g. omission of
a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(29) GAS 384 

GAS 384 corresponds to M1 GenBank accession numbers GI: 13622908 and GI: 15675693, to M3
GenBank accession number GI: 21911154, to M18 GenBank accession number GI: 19746801 and is
also referred to as 'Spy1874' (M1), 'SpyM3_1618' (M3), and 'SpyM18_1939' (M18). GAS 384 has
also been identified as a putative glycoprotein endopeptidase. Amino acid and polynucleotide
sequences of GAS 384 of an M1 strain are set forth below:

SEQ ID NO: 57 

25 MKTLAFOTSNKTLSLAILDDETLI^ 

GLRVAVATAKTLAYSLN I ALVGI SSLYALAASTCKQYPNTLWPLI DARRQNAYVGYYRQGKSVMPQAHA 
SLEVI IEQLVEEGQLI FVGETAPFAEKIQKKL PQAI LLPTLPSAYECGLLGQSIAPENVDAFVPQYLKRV 
EAEENWLKDNE I KDDSHYVKRI 

SEQ ID NO: 58

ATGAAGACACTTGCATTTGATACCTCAAATAAAACCT 

TAGCAGATATGACCCTTAACATTCAGAAAAAACATAGTGTTAGCCTTATGCCT 

GACTTGTACTGATCTTAAACCTCAAGATTTAGAAAGAATAGTGGTTGCAAAAGGCCCTGGATC 

GGTTTACGAGTGGCAGTTGCTACTGCAAAAACGTTAGCGTAC 
35 CGAGTCTATATGCTTTGGCTGCGTCTACTTGTAAACAGTATCC 

TGCTAGAAGGGAAAATGCGTATGTAGGTTATTATCGGCAAGGAAAAT 

TCACTAGAAGTTATTATAGAACAATTAGTAGAAGAAGGACAG 

TTGCTGAGAAAATTCAAAAGAAACTACCTCAGGCGATACTACTTCCAACC 

TGGTCTTTTGGGGOIAAGTTTGGCACCAGAAAATGTAGACGCCTTTG 
40 GAAGCTGAAGAAAACTGGCTCAAAGATAATGAGATAAAAGATGATAGT 

Preferred GAS 384 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 57; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 57, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 384 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 57. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 57. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 57. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(30) GAS 202 

GAS 202 corresponds to M1 GenBank accession numbers GI: 13622431 and GI: 15675258, to M3
GenBank accession number GI: 21910527, to M18 GenBank accession number GI: 19746290 and is
also referred to as 'Spy1309' (M1), 'SpyM3_0991' (M3), 'SpyM18_1321' (M18) and 'dltD'. GAS
202 has also been identified as a putative extramembranal protein. Amino acid and polynucleotide
sequences of GAS 202 of an M1 strain are set forth below:

SEQ ID NO: 59 

MLKRLWLILGPLLIAFVliWITIFSFPTQLDHSIAQEKANAVAITDSSFKNGLIKRQALSDETCRFVPFF 
GSSEWSRMDSMHPSVXAERYKRSYRPFLIGKRGSASLSHYYGIQQITNEMQKKK^ 
15 PSAVQMYLSNTQVIEFLLKARTDKESQFAAKRLLELNPGVSK^ 
LREESLFSFLGKSTNYEKRILPRVKGLPKVFSYKQLNAL^ 
YKNFQVNYSYLASPEYNDFQLLLSEFAKRKTDVXFVITPVNK^ 
FHRIADFSKDGGESYFMQDTIHLGWNGWIAFDKKVQPFLET^ 

SEQ ID NO: 60

ATGCTTAAGAGACTCTGGTTAATTCTAGGTCCTC 
TTAGTTTTCCTACACAACTTGATCATTCCATAGCTCAGGAAAAAG 
TTCTTTTAAAAATGGTTTGATCAAAAGACAAGCTTTAT 
GGTTCTAGCGAATGGAGTCGAATGGATAGTATGCACCCT^ 
25 ATAGACCATTTTTAATTGGTAAGAGAGGATCAGCATCT 

CAATGAAATGCAAAAGAAAAAAGCCATCTTTGTAGTATCTCCTCAATGGTTTACTGCT^ 

CCTAGTGCGGTTCAGATGTACTTGTCTAACACTCAAGTGATTC 

AAGAATC^CAGTTTGCAGCAAAGCGTTTGCTTGAGCTTAACCCTGGTGTC 

AAAAGTAAGTAAGGGTAAGTCTCTTAGTCGGTTAGACAGAGCTATTTTGAAATGTC71ACATCAAGTAGCA 
30 TTGAGAGAAGAGTCCCTTTTTAGTTTTTTAGGCAAATCTACTAACTATGAAAAAAGAATTTTGCCTCGCG 
TTAAGGGATTACCTAAAGTATTTTCGTATAAACAATTGAATGC^ 
AACAACCAACAACCGTTTTGGGATTAAAAATACATTT^^ 

TATMGAATTTCCAAGTTAATTATAGTTACCTGGCGTCACCAGAATACAATGATTT^ 
CAGAATTTGCTAAACGAAAAACAGATGTACTCTTTGTTATAACTCCTGTTAATAAAGCT 
35 TACCGGCTTAAATCAAGATAAGTATCAAGCGGCAGTTCGTAAAATAAAATTCCAGTTAAAGTCACAAGGA 
TTTCATCGC^TTGCTGACTTCTCAAAAGATGGTGGTGAGTCCTACTTTA 
GTTGGAATGGCTGGTTAGCTTTTGATAAGAAAGTGCAACCATTTCTAG 
CTATAAAATGAACCCTTATTTTTATAGTAAAATTTGGGCAAATAGGA 

Preferred GAS 202 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 59; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 59, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 202 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 59. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 59. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 59. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(31) GAS 057 

GAS 057 corresponds to M1 GenBank accession numbers GI: 13621655 and GI: 15674549, to M3
GenBank accession number GI: 21909834, to M18 GenBank accession number GI: 19745560 and is
also referred to as 'Spy0416' (M1), 'SpyM3_0298' (M3), 'SpyM18_0464' (M18) and 'prtS'. GAS
057 has also been identified as a putative cell envelope proteinase. Amino acid and polynucleotide
sequences of GAS 057 of an M1 strain are set forth below:

SEQ ID NO: 61 

1 0 M EKKQRFSLRKYKSGTFSVLIGSVFL SAESKSQD 
TSQITLKTNI^KEQSQDLVSEPTTTELADTD 

KGQGKWAVI DTG I DPAHQSMR I SDVSTAKVKSKEDMLARQKAAGINYGSWINDKVVFAHNYVENSDNI K 

ENQFEDFDEDWENFBFDAEAEPKAIKKHKITOPQSTQAPKEWIKTEETDGSHDIDWTO 

MHVTGI VAGNSKEAAATGERFLGI APEAQVMFMRVFANDIMGSAESLFI KAI EDAVALGADVINLSLGTA 

1 5 NGAQLSGSKPLMEAI E KAKKAGVS VWAAGNERVYGSDHDD PLATN PD YGLVG S PS TGRT PT S VAA IN S K 
V^VIQRLMTVKELENRADLNHGKAI YS ESVDFKDI KDSIX3YDKSHQFAYVKESTDAGYNAQDVKGKI ALI E 
RDPNKTYDEMI ALAKKHGALGVLI FNNKPGQSNRSMRLTANGMGI PSAFI SHEFGKAMSQLNGNGTGSLE 
FDSWSKAPSQKGNEMNHFSNWGLTSDGYLKPDITAPGGDIYSTY^NHYGSQTGTS^SPQIAGASLLV 
KQ YLEKTQPNLPKEKIADIVKNLLMSNAQ IHVNPETKTTTS PRQQGAGLLMI DGAVTSGtiYVTGKDNYGS 

20 ISLGNITDTOTFT>VTVWLSNKDKTLRYOT 
VTMDVSQFTKELTKQMPNGYYLEGFTOF^ 

FYFDESGPKDDIYVGKHFTGLVTLGSETNVSTKTISDNGLHTI/3TFKNAIX5KF 

GDNNQDFAAFKGVFLRKYQGLKASWHASDKEHKNPLWVSPESFKGDKNFNSDIRF7UCSTTLLGTAFSGK 
SLTGAELPDGHYHYVVSYYPDWGAKRQEMTFDMILDRQKPVXSQATFD^ 
25 DS VT YJjERKDNKPYTVTINDS YKYVSVEDNKTFVERQADGS FI LPLDKAKLGDFYYMVEDFAGNVAIAKL 
GDHLPQTLGKTPI KLKLTDGOTQTKETLKDNLEMTQSDTGLVTNQAQLAWHRNQPQSQL S 
PNEDGNKDFVAFKGLKNtTVTNDLTVNVYAra 

YQYWTYRDEHGKEHQKQYT I SVNDKKPMITQGRFDTINGVDHFTPDKTKALDS SGI VREEVF YLAKKNG 
RKFD VTEGKDGITVSDNKVY I PKNPDGS YTI SKRDGVTLSDYYYLVEDRAGNVS FATLRDLKAVGKDKAV 
30 VNFGIJDLPVPEDKQIVTJFTYLVTIDADGKPIENI^ 

SFTLSADNNFQQVTFKITMIATSQITAHFDHLLPEGSRVSL 
EVWSLPKGYRIEGNTKVT^LPNEVTIELSLRLVKVGDASDSTGDHKVM 
AK ALPSTGEXWGLKLRI VGLVLLGLTCVFSRKKS TKD 

SEQ ID NO: 62

GTGGAGAAAAAGCAACGTTTTTCCCTTAGAAAATACAAATCAGGAACGTTTTCGGTCTTAATAG 
TTTTCTTGGTGATGACAACAACAGTAGCAGCAGATGAGCTAAGCACAATnAnrn^ZirPa &r & &Tr arm a 

TCACGCTCAACAACAAGCXXIAACATCTCACCAATA 

ACATCACAAATCACTCTCAAGACAAATCGTGAAAAAGAGCAATC^ 

40 CAACTGAGCTAGCTOACACAGATGCAGCATCAATGGCTAATACAGGT^ 
TTCTTTACCGCCAGTCAATACAGATGTTCACGATTGGGTAAAA^ 

AAAGGACAAGGCAAGGTTGTCGCAGTTATTGACACAGGGATCGATCC 

GTGATGTATCAACTGCTAAAGTAAAATCAAAAGAAGACATGCTAGCACGC 
TTATGGGAGTTGGATAAATGATAAAGTTGTTTTTGCACATAA 
45 GAAAATCAATTCGAGGATTTTGATGAGGACTGGGAAAAC 

CCATCAAAAAACACAAGATCTATCGTCCCCAATCAACCCAGGCACCGAAAGAAACTGTTATCA 
AGAAACAGATGGTTCACATGATATTGACTGGACACAAACAGACGATGA^ 
ATGCATGTGACAGGTATTGTAGCCGGTAATAGCAAAGAAGCCGCTC 
TTGCACCAGAGGCCCAAGTCATGTTCATGCGTGTTTTTG 
50 CTTTATCAAAGCTATCGAAGATGCCGTGGCTTrAGGAGCA 

AATGGGGCACAGCTTAGTGGCAGCAAGCCTCTAATGGAAGCAATTGAAAAAGCTAAAA^ 
CAGTTGTTGTAGCAGCAGGAAATGAGCGCGTCTATGGATCTGACC^ 

AGACTATGGTTTGGTCGGTTCTCCCTCAACAGGTCGAACACCAACATCJVGTGGCA 
TGGGTGATTCAACGTCTAATGACGGTCAAAGAATTAGA 

55 TCTATTCAGAGTCTGTCGACTTTAAAGACATAAAAGAT^ 
TTATGTCAAAGAGTCAACTGATGCGGGTTATAACGCACAAG^

CGTGATCCCAATAAAACCTATGACGAAATGATTGCTTTGGCTAAG 
TTTTTAATAACAAGCCTGGTGAATCAAACCXSCTCAATGC^ 
TGCTTTCATATCGCACGAATTTGGTJ^^ 
5 TTTGACAGTGTGGTCTCAAAAGCACO^GTCAAAAAGG 

TAACTTCTGATGGCTATTTAAAACCTGACATTACTGCACCAGGTGGCGATATCTACT 

TAACCACTATGGTAGCCAAACAGGAACAAGTATGGCCTCTCCTCAGATTGCTGGC 

AAACAATACCTAGAAAAGACTCAGCO\AACTTGC 

TGATGAGCAATGCTCAAATTCATGTTAATCCAGAGACAAA 
1 0 AGGATTACTTMTATTGACGGAGCTGTCACTAGCGGCCTTTATGT^ 

ATATCATTAGGCAACATCAC7VGATACGATGACGTTTGATGTGACTGTTCA 

AAACATTACGTTATGACACAGAATTGCTAACAGATCATGTAGACC 

TTCTCACTCCTTAAAAACGTACCAAGGAGGAGAAGTTACAGTCCCAGCCAATGGA 

GTTACCATGGATGTCTCACAGTTCACAAAAGAGCTAACAAAACAGATC 
1 5 GTTTTGTCCGCTTTAGAGATAGTCAAGATGACCAACTAAATAGAGTAAACATTCCTTTTC 

AGGGCAATTTGAAAACTTAGCAGTTGCAGAAGAGTCCATTTACAGATTAAAATCTCAAG 

TTTTACTTTGATGAATCAGGTCCAAAAGACGATATCTATGTC 

TTGGTTCAGAGACCAATGTGTCAACCAAAACGAT^ 

AAATGCAGATGGCAAATTTATCTTAGAAAAAAATGCCCAAGGAAACCCTGTCTTAGCCATT^ 
20 GGTGACAACAACCAAGATTTTGCAGCCTTCAAAGGTGTTTT^ 

GTGTCTACCATGCTAGTGACAAGGAACACAAAAATCCACTGTGGGTCAGCCCAGAAAGCTTTAAAGGAGA 

TAAAAACTTTAATAGTGACATTAGATTTGCAAAATCAACGACCCTGTTAGGtt 

TCGTTAACAGGAGCTGMTTACCAGATGGGCATTAT 

GTGCCAAACGTCMGAAATGAC^TTTGACATGATrTTAGACCGACAAAAA 

25 ATTTGATCCTGAAACAAACCGATTCAAACCAGAACCCCTAAAAGACCGTGGATTAGCTGGTGTTCG 
GACAGTGTCTTTTATCTAGAAAGAAAAGACAACAAGCCTTATA^^ 

ATGTCTCAGTAGAAGACAATAAAACATTTGTGGAGCGACAAGCTGATGGCAGCTTTATCTTGCCGCTTG 
TAAAGCAAAATTAGGGGATTTCTATTACATGGTCGAGGATTTTGCAGGGAA 

GGAGATCACTTACCACAAACATTAGGTAAAACACCAATTAAACTTAAGCTTACAGACGGTAATTATCAGA 
30 CCAAAGAAACGCTTAAAGATAATCTTGAAATGACACAGTCTGACACAGGTCTAGTCACAAATCAAG 

GCTAGCAGTGGTGCACCGCAATCAGCCGCAAAGCCAGCTAACAAAGATGAATCAGGATTTCTTTATCTCA 
CCAAACGAAGATGGGAATAAAGACTTTGTGGCCTTTAAAGGCTTGAAAAATAACGTGTA 
CGGTTAACGTATACGCTAAAGATGACCACCAAAAACAAACCCCTATCTGGTCTAGTCAAGCAGGCGCTAG 
TGTATCCGCTATTGAAAGTACAGCCTGGTATGGCATAACAGCCCGAGGAAGCAAGGTGATGCCAGGTGAT 
35 TATCAGTATGTTGTGACCTATCGTGACGAACATGGTAAAGAACATCAAAAGCAGTACACCATATCTGTGA 
ATGACAAAAAACCAATGATCACTCAGGGACGTTTTGATACCATTAATGGCGTTGACCACTTTACTC 
CAAGACAAAAGCCCTTGACTCATCAGXSCATTGTCCGCGAAGtf^ 

CGTAAATTTGATGTGACAGAAGGTAAAGATGGTATCACAGTTAGTGACAATAAGGTGTATATCCCTAAM 
ATCCAGATGGTTCTTACACCATTTCAAAAAGAGATGGTGTCACA 

40 AGATAGAGCTGGTMTGTGTCTTTTGCTACCTTGCGTGACCTAAAAGCGGTCGGAAAAGACAAAGCAGTA 
GTCAATTTTGGATTAGACTTACCGGTCCCTGAAGACAAACAAATAGTGAACTTTACCTACCTTGTGCG^ 
ATGCAGATGGTAAACCGATTGAAAACCTAGAGTATTATAATAACTCAGGTAAGAGTCTTATCTTGCCATA 
CGGCAAATACACGGTCGAATTGTTGACCTATGACACCAATGCAGCCAAACTAGAGTCAGATAAAATCGTT 
TCCTTTACCTTGTCAGCTGATAAC^CTTCCAACAAGTTACCTTTAAGATAACGATC 

45 AAATAACTGCCCACTTTGATCATCTTTTGCCAGAAGGCAGTCGCGTTAGCCTTAAAACAGCTCAA 

GCTAATCCCGCTTGAACAGTCCTTGTATGTGCCTAAAGCTTATGGCAAAACCGTTCAAGAAGGCACTTAC 

GAAGTTGTTGTCAGCCTGCCTAAAGGCTACCGTATCGAAGGCAACACAAAGGTGAATACCCTACCAAA 

AAGTGCACGAACTATCATTACGCCTTGTCAAAGTAGGAGATGCCTCAGATTCAACTGGTGATCATAAGGT 
TATGTCAAAAAATAATTCACAGGCTTTGACAGCCTCTC 

50 GCAAAAGCCCTACCATCAACGGGTGAAAAAATGGGTCTCAAGTTGCGCATAGTAGGTCTTC 
GACTTACTTGCGTCTTTAGCCGAAAAAAATCAACCAAAGATTGA 

Preferred GAS 057 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 61; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 61, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 057 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 61. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 61. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 61. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 61 is removed. In another example, the underlined amino acid sequence at the
C-terminus of SEQ ID NO: 61 is removed. Other fragments omit one or more domains of the protein
(e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an
extracellular domain).

The immunogenicity of other known GAS antigens may be improved by combination with two or
more GAS antigens of the first antigen group. Such other known GAS antigens include (1) one or
more variants of the M surface protein or fragments thereof, (2) fibronectin-binding protein,
(3) streptococcal heme-associated protein, or (4) SagA. These antigens are referred to herein as
the "second antigen group".

The invention thus includes an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein.
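
To give a feel for how many distinct antigen combinations the preceding paragraph covers, the following minimal sketch counts them. It assumes a first antigen group of thirty-one members and a second antigen group of four members, as the ranges above suggest; the constant and function names are illustrative only and do not come from the specification.

```python
from math import comb

# Assumed group sizes, inferred from "two to thirty-one" and "one, two, three, or four".
FIRST_GROUP_SIZE = 31
SECOND_GROUP_SIZE = 4


def count_selections(pool_size: int, sizes) -> int:
    """Count the ways to pick k distinct antigens from a pool, summed over the allowed k."""
    return sum(comb(pool_size, k) for k in sizes)


# Three, four or five antigens from the first group (the "still more preferred" range).
preferred_first = count_selections(FIRST_GROUP_SIZE, range(3, 6))

# One to four antigens from the second group.
any_second = count_selections(SECOND_GROUP_SIZE, range(1, 5))

# Each first-group selection can be paired with each second-group selection.
print(preferred_first * any_second)
```

The count ignores fragments, variants and hybrid arrangements, each of which would enlarge the space further.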

Each of the GAS antigens of the second antigen group is described in more detail below.

(1) M surface protein

Over 100 different type variants of the M protein have been identified. Epitopes having increased
bactericidal activity and having decreased likelihood of cross-reacting with human tissues have been
identified in the amino terminal region and combined into fusion proteins containing approximately
six, seven, or eight M protein fragments linked in tandem. See Refs. 4, 5, 6, WO 02/094851 and WO
94/06465. (Each of the M protein variants, fragments and fusion proteins described in these
references is specifically incorporated herein by reference.)

Accordingly, the compositions of the invention may further comprise a GAS M surface protein or a
fragment or derivative thereof. One or more GAS M surface protein fragments may be combined
together in a fusion protein. Alternatively, one or more GAS M surface protein fragments are
combined with a GAS antigen or fragment thereof of the first antigen group. One example of a GAS
M protein is set forth below.

SEQ ID NO: 63

MAKNNTNRHYSLRKLKTGTASVAVALTVLGAGFANQTBVKANGIX5NPREVI EDLAANN PAI QN I RLRYEN 
KDLKARI^NAMEVAGRDFKRABELEKAKQAI^^ 
KEAI^LMDQASRDYKRATALEKELEEKKKALELAIDQ^
KLBLDQLSSEKEQLTIEKAKLEBBKQISDASRQSIJWDI^ 
DASRQGLRRDLDASRBAKKQV^KDLANLTAELDKVKEEK^ 
NSKLAALEKLNKEIjEESKKLTEKBKAELQAKIjEAEAK^ 
5 KAVPGKGQAPQAGTKPNQKXA PMXE TKRQL PSTGETAN PFFTAAALTVMATAGVAAWKRKE EN 

Preferred GAS M proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 63; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 63, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS M proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 63. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 63. Preferably, the fragment is one of those described in the
references above. Preferably, the fragment is constructed in a fusion protein with one or more
additional M protein fragments. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 63. Other fragments
omit one or more domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(2) Fibronectin-binding protein

GAS fibronectin-binding protein ('SfbI') is a multifunctional bacterial protein thought to mediate
attachment of the bacteria to host cells, to facilitate bacterial internalization into cells and to bind to
the Fc fragment of human IgG, thus interfering with Fc-receptor mediated phagocytosis and
antibody-dependent cell cytotoxicity. Immunization of mice with SfbI and an 'H12 fragment' (encoded
by positions 1240 - 1854 of the SfbI gene) is discussed in Refs. 7, 8 and 9. One example of an amino
acid sequence for GAS SfbI is shown below.

SEQ ID NO: 64 

MSFDGF FLHHLTNELKENLLYGRIQKVNQPFERBLVLTIRNHRKNYKLL1»SAHPVFGRVQI TQADFQN PQ 
VPNTFTMIMRKYLQGAVIEQLEQIDNDRI I EI KVSNKNE I GDAIQATLI I EIMGKHSNI ILVDRAENKI I 

30 ESIKHVGFSQNSYRTILPGSTYIEPPKTAAWPFTITDVPLFEILQTQELTO 

AELLTTDKLKRFRE FFARPTQANLTTAS FAPVLFSDSHATFETLSDMLDHF YQDKAERDR INQQASDLI H 
RVOTEIJ3KNRNKLSKQEAE3j1iATENAELFRQKGELLTTYLiSLVPNNQDSVI LDNYYTGEKI EI ALDKALT 
PNQNAQR YFKKYQKLKEAVKHLSGLI ADTKQS I TYFES VDYNLSQAS I DDI ED IREELYQAGFLKSRQRD 
KRHKRKKPEQY^DGTTILMVGRNN^ 

3 5 AELAAY YSKARLSNLVQVDMI EAKKLHKPSGAKPGFVTYTGQKTLRVTPDQAKI LSMKLS 

Preferred SfbI proteins for use with the invention comprise an amino acid sequence: (a) having 50%
or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 64; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 64, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These SfbI proteins include variants (e.g. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 64. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 64. Preferably, the fragment is one of those described in the references
above. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15,
20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
15, 20, 25 or more) from the N-terminus of SEQ ID NO: 64. Other fragments omit one or more
domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a
transmembrane domain, or of an extracellular domain).

(3) Streptococcal heme-associated protein

The GAS streptococcal heme-associated protein ('Shp') has been identified as a GAS cell surface
protein. It is thought to be cotranscribed with genes encoding homologues of an ABC transporter
involved in iron uptake in gram-negative bacteria. The Shp protein is further described in Ref. 10. One
example of a Shp protein is shown below:

SEQ ID NO: 65

MTKVVIKQLl^VIWFMISLSTMTNLVTAD^ 

WSDAMLEVSDAGKIVLTFRMSXJUDYSGNYQFWIQPGGTGSFQAVDYNITQKGTDTNGTTLDIAISLPW 
NSI I RGSMFV^PMGREWFYLSASBLIQKYSGNM1AQLVTETDNSQNQBVKDSQKPVT)TKLGBSQDESHT 
GAMITQNKPKANSSNNKSLSDKKILPSKMGLTTSLEL 
15 WKKRKKNDKTM 

Preferred Shp proteins for use with the invention comprise an amino acid sequence: (a) having 50% or
more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, 99.5% or more) to SEQ ID NO: 65; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 65, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These Shp proteins include variants (e.g. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 65. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 65. Other preferred fragments lack one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 65. Other fragments
omit one or more domains of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(4) SagA

Streptolysin S (SLS), also known as 'SagA', is thought to be produced by almost all GAS colonies.
This cytolytic toxin is responsible for the beta-hemolysis surrounding colonies of GAS grown on
blood agar and is thought to be associated with virulence. While the full SagA peptide has not been
shown to be immunogenic, a fragment of amino acids 10 - 30 (SagA 10 - 30) has been used to
produce neutralizing antibodies. See Ref. 11. The amino acid sequence of SagA 10 - 30 is shown
below:

SEQ ID NO: 66

FSIATGSGNSQGGSGSYTPGKC

Preferred SagA 10 - 30 proteins for use with the invention comprise an amino acid sequence: (a)
having 50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 66; and/or (b) which is a fragment of at
least n consecutive amino acids of SEQ ID NO: 66, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18,
or 20). These SagA 10 - 30 proteins include variants (e.g. allelic variants, homologs, orthologs,
paralogs, mutants, etc.) of SEQ ID NO: 66.
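
The fragment, truncation and identity criteria used for each antigen above can be illustrated with a short sketch. It uses SEQ ID NO: 66 as printed because that sequence is short. The specification does not name an alignment method, so the identity function below is an assumption: it scores ungapped, position-by-position identity over equal-length sequences, whereas practical comparisons would normally use a proper alignment tool. All function names are hypothetical.

```python
SAGA_10_30 = "FSIATGSGNSQGGSGSYTPGKC"  # SEQ ID NO: 66 as printed above


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def terminal_truncation(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Drop n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term >= len(reference):
        raise ValueError("truncation must leave at least one residue")
    return reference[n_term:len(reference) - c_term]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity (an assumed, simplified measure; see lead-in)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


print(is_fragment("GSGNSQG", SAGA_10_30))        # 7 consecutive residues
print(terminal_truncation(SAGA_10_30, 1, 2))     # lacks 1 N-terminal and 2 C-terminal residues
print(percent_identity(SAGA_10_30, "FSIATGSGNSQGGSGSYTPGKA"))
```

A candidate shorter than seven residues fails the fragment test even when it occurs in the reference, matching the "n is 7 or more" condition.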

There is an upper limit to the number of GAS antigens which will be in the compositions of the
invention. Preferably, the number of GAS antigens in a composition of the invention is less than 20,
less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than
12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4,
or less than 3. Still more preferably, the number of GAS antigens in a composition of the invention is
less than 6, less than 5, or less than 4. Still more preferably, the number of GAS antigens in a
composition of the invention is 3.

The GAS antigens used in the invention are preferably isolated, i.e. separate and discrete from the
whole organism with which the molecule is found in nature or, when the polynucleotide or
polypeptide is not found in nature, sufficiently free of other biological macromolecules that the
polynucleotide or polypeptide can be used for its intended purpose.

Fusion proteins

The GAS antigens used in the invention may be present in the composition as individual separate
polypeptides, but it is preferred that at least two (i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19 or 20) of the antigens are expressed as a single polypeptide chain (a 'hybrid' polypeptide).
Hybrid polypeptides offer two principal advantages: first, a polypeptide that may be unstable or
poorly expressed on its own can be assisted by adding a suitable hybrid partner that overcomes the
problem; second, commercial manufacture is simplified as only one expression and purification need
be employed in order to produce two polypeptides which are both antigenically useful.

The hybrid polypeptide may comprise two or more polypeptide sequences from the first antigen
group. Accordingly, the invention includes a composition comprising a first amino acid sequence and
a second amino acid sequence, wherein said first and second amino acid sequences are selected from a
GAS antigen or a fragment thereof of the first antigen group. Preferably, the first and second amino
acid sequences in the hybrid polypeptide comprise different epitopes.

The hybrid polypeptide may comprise one or more polypeptide sequences from the first antigen group
and one or more polypeptide sequences from the second antigen group. Accordingly, the invention
includes a composition comprising a first amino acid sequence and a second amino acid sequence,
said first amino acid sequence selected from a GAS antigen or a fragment thereof from the first
antigen group and said second amino acid sequence selected from a GAS antigen or a fragment
thereof from the second antigen group. Preferably, the first and second amino acid sequences in the
hybrid polypeptide comprise different epitopes.

Hybrids consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten
GAS antigens are preferred. In particular, hybrids consisting of amino acid sequences from two,
three, four, or five GAS antigens are preferred.

Different hybrid polypeptides may be mixed together in a single formulation. Within such
combinations, a GAS antigen may be present in more than one hybrid polypeptide and/or as a
non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a
non-hybrid, but not as both.

Hybrid polypeptides can be represented by the formula NH2-A-{-X-L-}n-B-COOH, wherein: X is an
amino acid sequence of a GAS antigen or a fragment thereof from the first antigen group or the
second antigen group; L is an optional linker amino acid sequence; A is an optional N-terminal amino
acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14 or 15.

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in
the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X-
moiety located at the N-terminus of the hybrid protein, i.e. the leader peptide of X1 will be retained,
but the leader peptides of X2 ... Xn will be omitted. This is equivalent to deleting all leader peptides
and using the leader peptide of X1 as moiety -A-.

For each of the n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For
instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH,
NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will typically
be short (e.g. 20 or fewer amino acids, i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1).
Examples comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e.
comprising Glyn where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisn where
n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker amino acid sequences will be apparent to
those skilled in the art. A useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a
BamHI restriction site, thus aiding cloning and manipulation, and the (Gly)4 tetrapeptide being a
typical poly-glycine linker.

-A- is an optional N-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer
amino acids, i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein
trafficking, or short peptide sequences which facilitate cloning or purification (e.g. histidine tags, i.e.
Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be
apparent to those skilled in the art. If X1 lacks its own N-terminus methionine, -A- is preferably an
oligopeptide (e.g. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides an N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer
amino acids, i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein
trafficking, short peptide sequences which facilitate cloning or purification (e.g. comprising histidine
tags, i.e. Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability.
Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art.

Most preferably, n is 2 or 3. 
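
The general formula for a hybrid polypeptide can be read as a simple concatenation rule, sketched below. This is only an illustration of the formula as described in the text: the function name, argument names and the placeholder X sequences are hypothetical and are not sequences from the specification. An empty string stands for an absent optional moiety (-A-, -B- or an individual -L-).

```python
from typing import Optional, Sequence


def build_hybrid(
    x_moieties: Sequence[str],
    linkers: Optional[Sequence[str]] = None,
    a: str = "",
    b: str = "",
) -> str:
    """Assemble A + (X1 + L1) + ... + (Xn + Ln) + B, written N-terminus to C-terminus."""
    n = len(x_moieties)
    if not 2 <= n <= 15:
        raise ValueError("the formula allows n from 2 to 15")
    if linkers is None:
        linkers = [""] * n  # every -L- absent
    if len(linkers) != n:
        raise ValueError("supply one linker slot (possibly empty) per X moiety")
    body = "".join(x + linker for x, linker in zip(x_moieties, linkers))
    return a + body + b


# Placeholder sequences standing in for two antigens (hypothetical, not from the specification).
x1, x2 = "MAAAA", "CCCCC"

# n = 2 with the GSGGGG linker after X1, no linker after X2, and a His6 tag as -B-.
print(build_hybrid([x1, x2], linkers=["GSGGGG", ""], b="HHHHHH"))
```

Leaving both linker slots empty gives the NH2-X1-X2-COOH arrangement, and filling both gives NH2-X1-L1-X2-L2-COOH, so the four n=2 examples in the text correspond to the four ways of filling two optional slots.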

The invention also provides nucleic acid encoding hybrid polypeptides of the invention. Furthermore,
the invention provides nucleic acid which can hybridise to this nucleic acid, preferably under "high
stringency" conditions (e.g. 65°C in a 0.1xSSC, 0.5% SDS solution).

Polypeptides of the invention can be prepared by various means (e.g. recombinant expression,
purification from cell culture, chemical synthesis, etc.) and in various forms (e.g. native, fusions,
non-glycosylated, lipidated, etc.). They are preferably prepared in substantially pure form (i.e.
substantially free from other GAS or host cell proteins).

Nucleic acid according to the invention can be prepared in many ways (e.g. by chemical synthesis,
from genomic or cDNA libraries, from the organism itself, etc.) and can take various forms (e.g.
single stranded, double stranded, vectors, probes, etc.). It is preferably prepared in substantially
pure form (i.e. substantially free from other GAS or host cell nucleic acids).

The term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing
modified backbones (e.g. phosphorothioates, etc.), and also peptide nucleic acids (PNA), etc. The
invention includes nucleic acid comprising sequences complementary to those described above (e.g.
for antisense or probing purposes).
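
As a small illustration of what a complementary sequence looks like, the sketch below computes the reverse complement of a DNA string, i.e. the complementary strand written 5' to 3'. Reading "complementary" as the reverse complement is an assumption made for the example, and the function name is hypothetical. The input is the first twelve bases of SEQ ID NO: 60 as printed above.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    return dna.translate(_COMPLEMENT)[::-1]


start_of_seq_60 = "ATGCTTAAGAGA"  # first twelve bases of SEQ ID NO: 60 as printed
print(reverse_complement(start_of_seq_60))
```

The validation step rejects the stray characters that appear in some of the printed sequences, so the function only accepts unambiguous A, C, G and T input.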

The invention also provides a process for producing a polypeptide of the invention, comprising the
step of culturing a host cell transformed with nucleic acid of the invention under conditions which
induce polypeptide expression.

The invention provides a process for producing a polypeptide of the invention, comprising the step of
synthesising at least part of the polypeptide by chemical means.

The invention provides a process for producing nucleic acid of the invention, comprising the step of
amplifying nucleic acid using a primer-based amplification method (e.g. PCR).

The invention provides a process for producing nucleic acid of the invention, comprising the step of 
synthesising at least part of the nucleic acid by chemical means. 

Strains 

Preferred polypeptides of the invention comprise an amino acid sequence found in an M1, M3 or M18
strain of GAS. The genomic sequence of an M1 GAS strain is reported at Ref. 12. The genomic
sequence of an M3 GAS strain is reported at Ref. 13. The genomic sequence of an M18 GAS strain is
reported at Ref. 14.

Where hybrid polypeptides are used, the individual antigens within the hybrid (i.e. individual -X-
moieties) may be from one or more strains. Where n=2, for instance, X2 may be from the same strain
as X1 or from a different strain. Where n=3, the strains might be (i) X1=X2=X3 (ii) X1=X2≠X3 (iii)
X1≠X2=X3 (iv) X1≠X2≠X3 or (v) X1=X3≠X2, etc.
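
The strain arrangements listed for n=3 are the ways of grouping three -X- moieties by shared strain. The sketch below enumerates such groupings for any n; it is an illustration only, and the function name and labelling scheme are assumptions. Each pattern is a tuple of labels in which equal labels mean "same strain" and the first occurrence of each new strain takes the next unused integer.

```python
from typing import Iterator, List, Tuple


def strain_patterns(n: int) -> Iterator[Tuple[int, ...]]:
    """Yield every way of grouping n X moieties by shared strain."""
    if n < 1:
        raise ValueError("n must be at least 1")

    def extend(prefix: List[int], used: int) -> Iterator[Tuple[int, ...]]:
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # Reuse any strain label seen so far, or introduce exactly one new label.
        for label in range(used + 1):
            yield from extend(prefix + [label], max(used, label + 1))

    yield from extend([], 0)


for pattern in strain_patterns(3):
    print(pattern)
```

For n=3 this produces five patterns: (0, 0, 0) is case (i), (0, 0, 1) is case (ii), (0, 1, 1) is case (iii), (0, 1, 2) is case (iv) with all three strains distinct, and (0, 1, 0) is case (v).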


Purification and Recombinant Expression 

The GAS antigens of the invention may be isolated from Streptococcus pyogenes, or they may be
recombinantly produced, for instance, in a heterologous host. Preferably, the GAS antigens are
prepared using a heterologous host. The heterologous host may be prokaryotic (e.g. a bacterium) or
eukaryotic. It is preferably E. coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae,
Salmonella typhi, Salmonella typhimurium, Neisseria lactamica, Neisseria cinerea, Mycobacteria
(e.g. M. tuberculosis), yeasts, etc.

Recombinant production of polypeptides is facilitated by adding a tag protein to the GAS antigen, so
that it is expressed as a fusion protein comprising the tag protein and the GAS antigen. Such tag
proteins can facilitate purification, detection and stability of the expressed protein. Tag proteins
suitable for use in the invention include a polyarginine tag (Arg-tag), polyhistidine tag (His-tag),
FLAG-tag, Strep-tag, c-myc-tag, S-tag, calmodulin-binding peptide, cellulose-binding domain,
SBP-tag, chitin-binding domain, glutathione S-transferase-tag (GST), maltose-binding protein,
transcription termination anti-termination factor (NusA), E. coli thioredoxin (TrxA) and protein
disulfide isomerase I (DsbA). Preferred tag proteins include His-tag and GST. A full discussion on the
use of tag proteins can be found at Ref. 15.

After purification, the tag proteins may optionally be removed from the expressed fusion protein, e.g.
by specifically tailored enzymatic treatments known in the art. Commonly used proteases include
enterokinase, tobacco etch virus (TEV) protease, thrombin, and factor Xa.

Immunogenic compositions and medicaments 

Compositions of the invention are preferably immunogenic compositions, and are more preferably
vaccine compositions. The pH of the composition is preferably between 6 and 8, preferably about 7.
The pH may be maintained by the use of a buffer. The composition may be sterile and/or
pyrogen-free. The composition may be isotonic with respect to humans.

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or
therapeutic (i.e. to treat infection), but will typically be prophylactic. Accordingly, the invention
includes a method for the therapeutic or prophylactic treatment of a Streptococcus pyogenes infection
in an animal susceptible to streptococcal infection comprising administering to said animal a
therapeutic or prophylactic amount of the immunogenic compositions of the invention. Preferably,
the immunogenic composition comprises a combination of GAS antigens, said combination consisting
of two to thirty-one GAS antigens of the first antigen group. Preferably, the combination of GAS
antigens consists of three, four, five, six, seven, eight, nine, or ten GAS antigens selected from the
first antigen group. Preferably, the combination of GAS antigens consists of three, four, or five GAS
antigens selected from the first antigen group. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117.

Alternatively, the invention includes an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein.

The invention also provides a composition of the invention for use as a medicament. The medicament
is preferably able to raise an immune response in a mammal (i.e. it is an immunogenic composition)
and is more preferably a vaccine.

The invention also provides the use of the compositions of the invention in the manufacture of a 
medicament for raising an immune response in a mammal. The medicament is preferably a vaccine. 

The invention also provides for a kit comprising a first component comprising a combination of GAS
antigens. In one embodiment, the combination of GAS antigens consists of a mixture of two to
thirty-one GAS antigens selected from the first antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Preferably,
the combination consists of three, four, or five GAS antigens from the first antigen group. Preferably,
the combination includes either or both of GAS 117 and GAS 040.

In another embodiment, the kit comprises a first component comprising a combination of GAS
antigens consisting of a mixture of two to thirty-one GAS antigens of the first antigen group and one,
two, three, or four GAS antigens of the second antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Still more
preferably, the combination consists of three, four or five GAS antigens from the first antigen group.
Preferably, the combination of GAS antigens includes either or both of GAS 40 and GAS 117.
Preferably, the combination of GAS antigens includes one or more variants of the M surface protein.

The invention also provides a delivery device pre-filled with the immunogenic compositions of the 
invention. 

The invention also provides a method for raising an immune response in a mammal comprising the 
step of administering an effective amount of a composition of the invention. The immune response is
preferably protective and preferably involves antibodies and/or cell-mediated immunity. The method 
may raise a booster response. 

The mammal is preferably a human. Where the vaccine is for prophylactic use, the human is 
preferably a child (e.g. a toddler or infant) or a teenager; where the vaccine is for therapeutic use, the 
human is preferably a teenager or an adult. A vaccine intended for children may also be administered
to adults eg. to assess safety, dosage, immunogenicity, etc.

These uses and methods are preferably for the prevention and/or treatment of a disease caused by
Streptococcus pyogenes (eg. pharyngitis (such as streptococcal sore throat), scarlet fever, impetigo,
erysipelas, cellulitis, septicemia, toxic shock syndrome, necrotizing fasciitis (flesh eating disease) and
sequelae (such as rheumatic fever and acute glomerulonephritis)). The compositions may also be
effective against other streptococcal bacteria.

One way of checking efficacy of therapeutic treatment involves monitoring GAS infection after 
administration of the composition of the invention. One way of checking efficacy of prophylactic 
treatment involves monitoring immune responses against the GAS antigens in the compositions of the 
invention after administration of the composition. 

Compositions of the invention will generally be administered directly to a patient. Direct delivery may
be accomplished by parenteral injection (e.g. subcutaneously, intraperitoneally, intravenously, 
intramuscularly, or to the interstitial space of a tissue), or by rectal, oral (eg. tablet, spray), vaginal, 
topical, transdermal {eg. see ref. 16} or transcutaneous {eg. see refs. 17 & 18}, intranasal {eg. see 
ref. 19}, ocular, aural, pulmonary or other mucosal administration. 

The invention may be used to elicit systemic and/or mucosal immunity.

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be 
used in a primary immunisation schedule and/or in a booster immunisation schedule. In a multiple 
dose schedule the various doses may be given by the same or different routes eg. a parenteral prime 
and mucosal boost, a mucosal prime and parenteral boost, etc. 

The compositions of the invention may be prepared in various forms. For example, the compositions
may be prepared as injectables, either as liquid solutions or suspensions. Solid forms suitable for
solution in, or suspension in, liquid vehicles prior to injection can also be prepared (eg. a lyophilised
composition). The composition may be prepared for topical administration eg. as an ointment, cream
or powder. The composition may be prepared for oral administration eg. as a tablet or capsule, as a
spray, or as a syrup (optionally flavoured). The composition may be prepared for pulmonary
administration eg. as an inhaler, using a fine powder or a spray. The composition may be prepared as
a suppository or pessary. The composition may be prepared for nasal, aural or ocular administration
eg. as drops. The composition may be in kit form, designed such that a combined composition is
reconstituted just prior to administration to a patient. Such kits may comprise one or more antigens in
liquid form and one or more lyophilised antigens.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of 
antigen(s), as well as any other components, as needed. By 'immunologically effective amount', it is 
meant that the administration of that amount to an individual, either in a single dose or as part of a 
series, is effective for treatment or prevention. This amount varies depending upon the health and 
physical condition of the individual to be treated, age, the taxonomic group of individual to be treated
(eg. non-human primate, primate, etc.), the capacity of the individual's immune system to synthesise
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's
assessment of the medical situation, and other relevant factors. It is expected that the amount will fall
in a relatively broad range that can be determined through routine trials.

Further components of the composition 

The composition of the invention will typically, in addition to the components mentioned above, 
comprise one or more 'pharmaceutically acceptable carriers', which include any carrier that does not
itself induce the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolised macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of
ordinary skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc.
Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances,
and the like, may be present. A thorough discussion of pharmaceutically acceptable excipients is
available in reference 20.

Vaccines of the invention may be administered in conjunction with other immunoregulatory agents. In 
particular, compositions will usually include an adjuvant.

Preferred further adjuvants include, but are not limited to, one or more of the following set forth 
below: 

A. Mineral Containing Compositions 

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts, 
such as aluminium salts and calcium salts. The invention includes mineral salts such as hydroxides
(e.g. oxyhydroxides), phosphates (eg. hydroxyphosphates, orthophosphates), sulphates, etc. {eg. see
chapters 8 & 9 of ref. 21}, or mixtures of different mineral compounds, with the compounds taking
any suitable form (eg. gel, crystalline, amorphous, etc.), and with adsorption being preferred. The
mineral containing compositions may also be formulated as a particle of metal salt. See ref. 22. 

B. Oil-Emulsions

Oil-emulsion compositions suitable for use as adjuvants in the invention include squalene-water 
emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into 
submicron particles using a microfluidizer). See ref. 23. 

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as 
adjuvants in the invention.

C. Saponin Formulations 

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterologous
group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots
and even flowers of a wide range of plant species. Saponins from the bark of the Quillaia saponaria
Molina tree have been widely studied as adjuvants. Saponin can also be commercially obtained from
Smilax ornata (sarsaparilla), Gypsophila paniculata (brides veil), and Saponaria officinalis (soap
root). Saponin adjuvant formulations include purified formulations, such as QS21, as well as lipid
formulations, such as ISCOMs.

Saponin compositions have been purified using High Performance Thin Layer Chromatography
(HP-TLC) and Reversed Phase High Performance Liquid Chromatography (RP-HPLC). Specific purified
fractions using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A,
QH-B and QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in U.S.
Patent No. 5,057,540. Saponin formulations may also comprise a sterol, such as cholesterol (see WO
96/33739).

Combinations of saponins and cholesterols can be used to form unique particles called 
Immunostimulating Complexes (ISCOMs). ISCOMs typically also include a phospholipid such as
phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be used in ISCOMs.
Preferably, the ISCOM includes one or more of Quil A, QHA and QHC. ISCOMs are further
described in EP 0 109 942, WO 96/11711 and WO 96/33739. Optionally, the ISCOMs may be
devoid of additional detergent. See ref. 24.

A review of the development of saponin based adjuvants can be found at ref. 25. 

C. Virosomes and Virus Like Particles (VLPs)

Virosomes and Virus Like Particles (VLPs) can also be used as adjuvants in the invention. These
structures generally contain one or more proteins from a virus optionally combined or formulated with
a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain any
of the native viral genome. The viral proteins may be recombinantly produced or isolated from whole
viruses. Viral proteins suitable for use in virosomes or VLPs include proteins derived from
influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), Hepatitis E
virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, Norwalk
virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA-phage,
fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed further in
WO 03/024480, WO 03/024481, and Refs. 26, 27, 28 and 29. Virosomes are discussed further in, for
example, Ref. 30.

D. Bacterial or Microbial Derivatives 

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as: 

(1) Non-toxic derivatives of enterobacterial lipopolysaccharide (LPS)

Such derivatives include Monophosphoryl lipid A (MPL) and 3-O-deacylated MPL (3dMPL). 
3dMPL is a mixture of 3 De-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated chains. A
preferred "small particle" form of 3 De-O-acylated monophosphoryl lipid A is disclosed in EP 0 689
454. Such "small particles" of 3dMPL are small enough to be sterile filtered through a 0.22 micron
membrane (see EP 0 689 454). Other non-toxic LPS derivatives include monophosphoryl lipid A
mimics, such as aminoalkyl glucosaminide phosphate derivatives e.g. RC-529. See Ref. 31.

(2) Lipid A Derivatives

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is
described for example in Refs. 32 and 33.

(3) Immunostimulatory oligonucleotides

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide
sequences containing a CpG motif (a sequence containing an unmethylated cytosine followed by 
guanosine and linked by a phosphate bond). Bacterial double stranded RNA or oligonucleotides 
containing palindromic or poly(dG) sequences have also been shown to be immunostimulatory. 

The CpG's can include nucleotide modifications/analogs such as phosphorothioate modifications and
can be double-stranded or single-stranded. Optionally, the guanosine may be replaced with an analog
such as 2'-deoxy-7-deazaguanosine. See ref. 34, WO 02/26757 and WO 99/62923 for examples of
possible analog substitutions. The adjuvant effect of CpG oligonucleotides is further discussed in
Refs. 35, 36, WO 98/40100, U.S. Patent No. 6,207,646, U.S. Patent No. 6,239,116, and U.S. Patent
No. 6,429,199.

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT. See ref. 37. 
The CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it
may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs
are discussed in refs. 38, 39 and WO 01/95935. Preferably, the CpG is a CpG-A ODN. 

Preferably, the CpG oligonucleotide is constructed so that the 5' end is accessible for receptor
recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3' ends to form
"immunomers". See, for example, refs. 40, 41, 42 and WO 03/035836.

(4) ADP-ribosylating toxins and detoxified derivatives thereof. 

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the 
invention. Preferably, the protein is derived from E. coli (i.e., E. coli heat labile enterotoxin "LT"),
cholera ("CT"), or pertussis ("PT"). The use of detoxified ADP-ribosylating toxins as mucosal
adjuvants is described in WO 95/17211 and as parenteral adjuvants in WO 98/42375. Preferably, the
adjuvant is a detoxified LT mutant such as LT-K63. 

E. Human Immunomodulators 

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as
interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. interferon-γ),
macrophage colony stimulating factor, and tumor necrosis factor. 

F. Bioadhesives and Mucoadhesives 

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable 
bioadhesives include esterified hyaluronic acid microspheres (Ref. 43) or mucoadhesives such as
cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone,
polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as 
adjuvants in the invention. E.g., ref. 44. 

G. Microparticles 

Microparticles may also be used as adjuvants in the invention. Microparticles (i.e. a particle of
~100nm to ~150μm in diameter, more preferably ~200nm to ~30μm in diameter, and most preferably
~500nm to ~10μm in diameter) formed from materials that are biodegradable and non-toxic (eg. a
poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a
polycaprolactone, etc.), with poly(lactide-co-glycolide), are preferred, optionally treated to have a
negatively-charged surface (eg. with SDS) or a positively-charged surface (e.g. with a cationic
detergent, such as CTAB).

H. Liposomes 

Examples of liposome formulations suitable for use as adjuvants are described in U.S. Patent No.
6,090,406, U.S. Patent No. 5,916,588, and EP 0 626 169.

I. Polyoxyethylene ether and Polyoxyethylene Ester Formulations

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene 
esters. Ref. 45. Such formulations further include polyoxyethylene sorbitan ester surfactants in 
combination with an octoxynol (Ref. 46) as well as polyoxyethylene alkyl ethers or ester surfactants 
in combination with at least one additional non-ionic surfactant such as an octoxynol (Ref. 47).

Preferred polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl 
ether (laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether,
polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.

J. Polyphosphazene (PCPP)

PCPP formulations are described, for example, in Refs. 48 and 49.

K. Muramyl peptides

Examples of muramyl peptides suitable for use as adjuvants in the invention include
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine
(nor-MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-
hydroxyphosphoryloxy)-ethylamine (MTP-PE).

L. Imidazoquinolone Compounds

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include
Imiquimod and its homologues, described further in Refs. 50 and 51.

The invention may also comprise combinations of aspects of one or more of the adjuvants identified 
above. For example, the following adjuvant compositions may be used in the invention:

(1) a saponin and an oil-in-water emulsion (ref. 52); 

(2) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) (see WO
94/00153);

(3) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) + a cholesterol;

(4) a saponin (e.g. QS21) + 3dMPL + IL-12 (optionally + a sterol) (Ref. 53);

combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions (Ref. 54);

(5) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-block polymer L121,
and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger
particle size emulsion;

(6) Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% Squalene, 0.2%
Tween 80, and one or more bacterial cell wall components from the group consisting of
monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); and

(7) one or more mineral salts (such as an aluminum salt) + a non-toxic derivative of LPS
(such as 3dMPL).

Aluminium salts and MF59 are preferred adjuvants for parenteral immunisation. Mutant bacterial
toxins are preferred mucosal adjuvants. 

The composition may include an antibiotic.

Further antigens

The compositions of the invention may further comprise one or more additional non-GAS antigens, 
including additional bacterial, viral or parasitic antigens.

In one embodiment, the GAS antigen combinations of the invention are combined with one or more 
additional, non-GAS antigens suitable for use in a paediatric vaccine. For example, the GAS antigen 
combinations may be combined with one or more antigens derived from a bacteria or virus selected 
from the group consisting of N. meningitidis (including serogroup A, B, C, W135 and/or Y), 
Streptococcus pneumoniae, Bordetella pertussis, Moraxella catarrhalis, Tetanus, Diphtheria,
Respiratory Syncytial virus (RSV), polio, measles, mumps, rubella, and rotavirus.

In another embodiment, the GAS antigen combinations of the invention are combined with one or 
more additional, non-GAS antigens suitable for use in a vaccine designed to protect elderly or 
immunocompromised individuals. For example, the GAS antigen combinations may be combined
with an antigen derived from the group consisting of Enterococcus faecalis, Staphylococcus
aureus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Legionella pneumophila, Listeria
monocytogenes, influenza, and Parainfluenza virus ('PIV'). 

Where a saccharide or carbohydrate antigen is used, it is preferably conjugated to a carrier protein in 
order to enhance immunogenicity {e.g. refs. 55 to 64}. Preferred carrier proteins are bacterial toxins
or toxoids, such as diphtheria or tetanus toxoids. The CRM197 diphtheria toxoid is particularly
preferred {65}. Other carrier polypeptides include the N.meningitidis outer membrane protein {66},
synthetic peptides {67, 68}, heat shock proteins {69, 70}, pertussis proteins {71, 72}, protein D from
H.influenzae {73}, cytokines {74}, lymphokines, hormones, growth factors, toxin A or B from
C.difficile {75}, iron-uptake proteins {76}, etc. Where a mixture comprises capsular saccharides from
both serogroups A and C, it may be preferred that the ratio (w/w) of MenA saccharide:MenC
saccharide is greater than 1 (eg. 2:1, 3:1, 4:1, 5:1, 10:1 or higher). Different saccharides can be
conjugated to the same or different type of carrier protein. Any suitable conjugation reaction can be 
used, with any suitable linker where necessary. 

Toxic protein antigens may be detoxified where necessary eg. detoxification of pertussis toxin by 
chemical and/or genetic means.

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus
antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to 
include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is 
preferred also to include diphtheria and tetanus antigens. 

Antigens in the composition will typically be present at a concentration of at least 1 μg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against 
that antigen. 

As an alternative to using protein antigens in the composition of the invention, nucleic acid encoding 
the antigen may be used {eg refs. 77 to 85}. Protein components of the compositions of the 
invention may thus be replaced by nucleic acid (preferably DNA eg. in the form of a plasmid) that
encodes the protein. 

Definitions 

The term "comprising" means "including" as well as "consisting" eg. a composition "comprising" X 
may consist exclusively of X or may include something additional eg. X + Y. 

The term "about" in relation to a numerical value x means, for example, x±10%.

References to a percentage sequence identity between two amino acid sequences means that, when 
aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment 
and the percent homology or sequence identity can be determined using software programs known in 
the art, for example those described in section 7.7.18 of reference 86. A preferred alignment is 
determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap
open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman
homology search algorithm is disclosed in reference 87. 
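
The percentage-identity definition above can be sketched in a few lines of Python. This is only an illustration: it assumes the two sequences have already been aligned (for example by a Smith-Waterman search), and the function name, the "-" gap character, and the choice to divide by the number of gap-free columns are assumptions, since the text does not state which denominator is used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, not taken from any GAS antigen.
identity = percent_identity("MKT-LLVA", "MKSALLIA")
```

Here five of the seven gap-free columns match, so the result is a little over 71%. A different denominator (for example the length of the shorter sequence) would give a different figure, so the convention should be fixed before values are compared.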

The following example demonstrates one way of preparing recombinant GAS antigens of the 
invention and testing their efficacy in a murine model. 

EXAMPLE 1: Preparation of recombinant GAS antigens of the invention and Demonstration of Efficacy in Murine Model.

Recombinant GAS proteins corresponding to two or more of the GAS antigens of the first antigen 
group are expressed as follows. 

1. Cloning of GAS antigens for expression in E. coli

The selected GAS antigens were cloned in such a way as to obtain two different kinds of
recombinant proteins: (1) proteins having a hexa-histidine tag at the carboxy-terminus (Gas-His)
and (2) proteins having the hexa-histidine tag at the carboxy-terminus and GST at the
amino-terminus (Gst-Gas-His). Type (1) proteins were obtained by cloning in a pET21b+ vector
(available from Novagen). The type (2) proteins were obtained by cloning in a pGEX-NNH
vector. This cloning strategy allowed for the GAS genomic DNA to be used to amplify the
selected genes by PCR, to perform a single restriction enzyme digestion of the PCR products and
to clone them simultaneously into both vectors.

(a) Construction of pGEX-NNH expression vectors 

Two pairs of complementary oligodeoxyribonucleotides are synthesised using the DNA synthesiser
ABI394 (Perkin Elmer) and reagents from Cruachem (Glasgow, Scotland). Equimolar amounts of the
oligo pairs (50 ng each oligo) are annealed in T4 DNA ligase buffer (New England Biolabs) for 10
min in a final volume of 50 μl and then left to cool slowly at room temperature. With the described
procedure the following DNA linkers are obtained:

gexNN linker

NdeI NheI XmaI EcoRI NcoI SalI XhoI SacI

GATCCCATATGGCTAGCCCGGGGAATTCGTC

GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACTAGCTCGAG

NotI

CTGAGCGGCCGCATGAA
GACTCGCCGGCGTACTTTCGA

gexNNH linker

HindIII NotI XhoI Hexa-Histidine

TCGACAAGCTTGCGGCCGCACTCGAGCATCACCATCACCATCACTGAT

GTTCGAACGCCGGCGTGAGCACGTAGAGGTAGTGGTAGTGACTATCGA

The plasmid pGEX-KG [K. L. Guan and J. E. Dixon, Anal. Biochem. 192, 262 (1991)] is digested
with BamHI and HindIII and 100 ng is ligated overnight at 16 °C to the linker gexNN with a molar
ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase (New England Biolabs). After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NN plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.
The new plasmid pGEX-NN is digested with SalI and HindIII and ligated to the linker gexNNH. After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.
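
As a quick sanity check on the gexNNH linker, the following Python sketch locates the HindIII, NotI and XhoI recognition sites in the top strand as printed above and translates the region that follows the XhoI site. The helper names are illustrative, and the reading frame (starting at the first base of the XhoI site) is an assumption made for this example rather than something stated in the text.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Top strand of the gexNNH linker as printed in the text.
GEXNNH_TOP = "TCGACAAGCTTGCGGCCGCACTCGAGCATCACCATCACCATCACTGAT"

SITES = {"HindIII": "AAGCTT", "NotI": "GCGGCCGC", "XhoI": "CTCGAG"}


def site_positions(seq: str) -> dict:
    """0-based start of each recognition site in seq, or -1 if the site is absent."""
    return {name: seq.find(site) for name, site in SITES.items()}


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq, stopping after a stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


positions = site_positions(GEXNNH_TOP)
# Assumed frame: read from the first base of the XhoI site.
tag = translate(GEXNNH_TOP[positions["XhoI"]:])
```

Under that assumed frame the XhoI site encodes Leu-Glu and is followed by six histidine codons and a TGA stop codon, which matches the "Hexa-Histidine" label on the linker.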

(b) Chromosomal DNA preparation 

GAS SF370 strain is grown in THY medium until OD600 is 0.6-0.8. Bacteria are then centrifuged,
suspended in TES buffer with lysozyme (10 mg/ml) and mutanolysin (10 U/μl) and incubated 1 hr at
37 °C. Following treatment of the bacterial suspension with RNAase, Proteinase K and 10%
Sarcosyl/EDTA, protein extraction with saturated phenol and phenol/chloroform is carried out. The
resulting supernatant is precipitated with Sodium Acetate/Ethanol and the extracted DNA is pelleted
by centrifugation, suspended in Tris buffer and kept at -20 °C.

(c) Oligonucleotide design

Synthetic oligonucleotide primers are designed on the basis of the coding sequence of each GAS
antigen using the sequence of Streptococcus pyogenes SF370 M1 strain. Any predicted signal peptide
is omitted, by deducing the 5' end amplification primer sequence immediately downstream from the
predicted leader sequence. For most GAS antigens, the 5' tails of the primers (see Table 1, below)
include only one restriction enzyme recognition site (NdeI, or NheI, or SpeI depending on the gene's
own restriction pattern); the 3' primer tails (see Table 1) include a XhoI or a NotI or a HindIII
restriction site.



5' tails                          3' tails

NdeI   5' GTGCGTCATATG 3'         XhoI     5' GCGTCTCGAG 3'
NheI   5' GTGCGTGCTAGC 3'         NotI     5' ACTCGCTAGCGGCCGC 3'
SpeI   5' GTGCGTACTAGT 3'         HindIII  5' GCGTAAGCTT 3'

Table 1. Oligonucleotide tails of the primers used to amplify genes encoding selected GAS
antigens.

As well as containing the restriction enzyme recognition sequences, the primers include nucleotides 
which hybridize to the sequence to be amplified. The number of hybridizing nucleotides depends on 
the melting temperature of the primers, which can be determined as described [Breslauer et al., Proc.
Nat. Acad. Sci. 83, 3746-50 (1986)]. The average melting temperature of the selected oligos is 50-55
°C for the hybridizing region alone and 65-75 °C for the whole oligos. Oligos can be purchased from
MWG-Biotech S.p.A. (Firenze, Italy).
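
The primer layout described above (restriction-site tail followed by a gene-specific hybridizing region) can be sketched as follows. Note that this example does not implement the Breslauer nearest-neighbour method cited in the text; it swaps in the much simpler Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a rough guide for short oligos. The hybridizing sequence is hypothetical and the function names are illustrative.

```python
# 5' tails from Table 1.
TAILS_5 = {
    "NdeI": "GTGCGTCATATG",
    "NheI": "GTGCGTGCTAGC",
    "SpeI": "GTGCGTACTAGT",
}


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in °C: 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def build_primer(tail: str, hybridizing: str) -> str:
    """Full-length primer: restriction-site tail followed by the hybridizing region."""
    return tail + hybridizing


# Hypothetical hybridizing region, not taken from any GAS gene.
HYBRIDIZING = "ATGACTGAAGCTCAAGT"

primer = build_primer(TAILS_5["NdeI"], HYBRIDIZING)
tm_hybridizing = wallace_tm(HYBRIDIZING)  # estimate for the hybridizing region alone
tm_tail = wallace_tm(TAILS_5["NdeI"])     # contribution of the tail by the same rule
```

Estimating the hybridizing region and the whole oligo separately mirrors the two temperature ranges quoted in the text, although the absolute values from this rule will differ from nearest-neighbour estimates.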

(d) PCR amplification

The standard PCR protocol is as follows: 50 ng genomic DNA are used as template in the presence of
0.2 μM each primer, 200 μM each dNTP, 1.5 mM MgCl2, 1x PCR buffer minus Mg (Gibco-BRL),
and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 100 μl. Each
sample undergoes a double-step amplification: the first 5 cycles are performed using as the
hybridizing temperature that of one of the oligos excluding the restriction enzyme tail, followed by 25
cycles performed according to the hybridization temperature of the whole length primers. The
standard cycles are as follows:

one cycle:

denaturation: 94 °C, 2 min

5 cycles:

denaturation: 94 °C, 30 seconds
hybridization: ^1 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

25 cycles:

denaturation: 94 °C, 30 seconds
hybridization: 70 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

72 °C, 7 min
4 °C

The elongation time is 1 min for GAS antigens encoded by ORFs shorter than 2000 bp, and 2 min and 
40 seconds for ORFs longer than 2000 bp. The amplifications are performed using a GeneAmp PCR
System 9600 (Perkin Elmer).
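
The two-step cycling scheme and the ORF-length rule for the elongation time can be written down as a small data structure. This is a sketch under stated assumptions: the function names and the tuple layout are illustrative, the first-step hybridization temperature is left as a caller-supplied parameter, and an ORF of exactly 2000 bp (which the text does not cover) is treated as "long". The final 4 °C hold is omitted.

```python
def elongation_seconds(orf_length_bp: int) -> int:
    """1 min for ORFs shorter than 2000 bp, 2 min 40 s otherwise."""
    # Exactly 2000 bp is not covered by the text; treating it as long is an assumption.
    return 60 if orf_length_bp < 2000 else 160


def cycling_program(orf_length_bp: int, first_step_hyb_c: float) -> list:
    """Build the program as a list of blocks; each step is (name, temperature °C, seconds)."""
    ext = elongation_seconds(orf_length_bp)

    def amplification(cycles: int, hyb_c: float) -> dict:
        return {
            "cycles": cycles,
            "steps": [
                ("denaturation", 94, 30),
                ("hybridization", hyb_c, 50),
                ("elongation", 72, ext),
            ],
        }

    return [
        {"cycles": 1, "steps": [("denaturation", 94, 120)]},
        amplification(5, first_step_hyb_c),   # tail excluded from the hybridization estimate
        amplification(25, 70),                # whole-length primers
        {"cycles": 1, "steps": [("final elongation", 72, 420)]},
    ]


# Hypothetical 1500 bp ORF and a hypothetical first-step hybridization temperature.
program = cycling_program(orf_length_bp=1500, first_step_hyb_c=52.0)
```

Keeping the first-step temperature as a parameter reflects that it depends on the hybridizing region of each primer pair rather than being a fixed value.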

To check the amplification results, 4 μl of each PCR product is loaded onto 1-1.5% agarose gel and the
size of amplified fragments compared with DNA molecular weight standards (DNA markers III or IX,
Roche). The PCR products are loaded on agarose gel and after electrophoresis the right size bands are
excised from the gel. The DNA is purified from the agarose using the Gel Extraction Kit (Qiagen)
following the instructions of the manufacturer. The final elution volume of the DNA is 50 μl TE (10
mM Tris-HCl, 1 mM EDTA, pH 8). One μl of each purified DNA is loaded onto agarose gel to
evaluate the yield.

(e) Digestion of PCR fragments 

One to two μg of purified PCR products are double digested overnight at 37 °C with the appropriate
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 μl final
volume. The restriction enzymes and the digestion buffers are from New England Biolabs. After
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 μl TE, 1 μl is
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular 
weight standards (DNA markers III or IX, Roche). 

(f) Digestion of the cloning vectors (pET21b+ and pGEX-NNH)

10 μg of plasmid is double digested with 100 units of each restriction enzyme in 400 μl reaction
volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis
on a 1% agarose gel, the band corresponding to the digested vector is purified from the gel using the
Qiagen Qiaex II Gel Extraction Kit and the DNA is eluted with 50 μl TE. The DNA concentration
is evaluated by measuring OD260 of the sample.

(g) Cloning of the PCR products 
Seventy-five ng of the appropriately digested and purified vectors and the digested and purified
fragments corresponding to each selected GAS antigen are ligated in final volumes of 10-20 µl with a
molar ratio of 1:1 fragment/vector, using 400 units of T4 DNA ligase (New England Biolabs) in the
presence of the buffer supplied by the manufacturer. The reactions are incubated overnight at 16 °C.
Transformation of E. coli BL21 (Novagen) and E. coli BL21-DE3 (Novagen) electrocompetent cells is
performed using pGEX-NNH ligations and pET21b+ ligations respectively. The transformation
procedure is as follows: 1-2 µl of the ligation reaction is mixed with 50 µl of ice-cold competent cells,
then the cells are transferred to a Gene Pulser 0.1 cm electrode cuvette (Biorad). After pulsing the cells in
a MicroPulser electroporator (Biorad) following the manufacturer's instructions, the cells are suspended
in 0.95 ml of SOC medium and incubated for 45 min at 37 °C under shaking. 100 and 900 µl of cell
suspensions are plated on separate plates of LB agar with 100 µg/ml ampicillin and the plates are
incubated overnight at 37 °C. The screening of the transformants is done by PCR: randomly chosen
transformants are picked and suspended in 30 µl of PCR reaction mix containing the PCR buffer, the
4 dNTPs, 1.5 mM MgCl2, Taq polymerase and appropriate forward and reverse oligonucleotide
primers that are able to hybridize upstream and downstream from the polylinker of the pET21b+ or
pGEX-NNH vectors. After 30 cycles of PCR, 5 µl of the resulting products are run on agarose gel
electrophoresis in order to select for positive clones from which the expected PCR band is obtained.
PCR positive clones are chosen on the basis of the correct size of the PCR product, as evaluated by
comparison with appropriate molecular weight markers (DNA markers III or IX, Roche).
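
The 1:1 fragment/vector molar ratio used in the ligation can be turned into a mass of insert once the two lengths are known. The sketch below shows that arithmetic; the vector and insert sizes are hypothetical placeholders, since the text gives neither, and the helper name is made up for illustration.

```python
# Sketch: mass of insert needed for a given insert:vector molar ratio.
# For double-stranded DNA, moles are proportional to mass / length,
# so equal moles means mass scales with the length ratio.

def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio=1.0):
    """Return the insert mass (ng) giving the requested molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# 75 ng of vector as in the text; the 5400 bp and 900 bp sizes are assumed examples.
mass = insert_mass_ng(75.0, vector_bp=5400, insert_bp=900)
print(f"{mass:.1f} ng of insert for a 1:1 molar ratio")
```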

2. Protein expression 

PCR positive colonies are inoculated in 3 ml LB with 100 µg/ml ampicillin and grown at 37 °C overnight.
70 µl of the overnight culture is inoculated in 2 ml LB/Amp and grown at 37 °C until the OD600 of the
pET clones reaches 0.4-0.8 or until the OD600 of the pGEX clones reaches 0.8-1.
Protein expression is then induced by adding 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) to
the mini-cultures. After 3 hours of incubation at 37 °C the final OD600 is checked and the cultures are
cooled on ice. After centrifugation of 0.5 ml of culture, the cell pellet is suspended in 50 µl of protein
Loading Sample Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v
Bromophenol Blue, 100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample
corresponding to 0.1 OD600 of culture is analysed by SDS-PAGE and Coomassie Blue staining to verify
the presence of the induced protein band.
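
The loading volume "corresponding to 0.1 OD600" follows from the final OD600 of each mini-culture. The sketch below assumes that "0.1 OD600" means 0.1 OD units, i.e. the cells contained in 1 ml of culture at OD600 = 0.1; that reading, and the function name, are assumptions rather than something the text spells out.

```python
# Sketch: volume of boiled sample equivalent to 0.1 OD600 units of culture.
# 0.5 ml of culture is pelleted and resuspended in 50 µl of loading buffer (from the text).

def load_volume_ul(final_od600, pelleted_ml=0.5, buffer_ul=50.0, target_od_units=0.1):
    """Return the µl of resuspended pellet holding `target_od_units` of cells."""
    od_units_in_pellet = final_od600 * pelleted_ml
    return buffer_ul * target_od_units / od_units_in_pellet


# Hypothetical final readings for three mini-cultures.
for od in (1.0, 2.0, 2.5):
    print(f"OD600 {od}: load {load_volume_ul(od):.1f} µl")
```

Normalising the load in this way keeps the amount of cell material per lane roughly constant, so band intensities can be compared between clones.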

3. Purification of the recombinant proteins

Single colonies are inoculated in 25 ml LB with 100 µg/ml ampicillin and grown at 37 °C overnight. The
overnight culture is inoculated in 500 ml LB/Amp and grown under shaking at 25 °C until OD600 0.4-0.7.
Protein expression is then induced by adding 1 mM IPTG to the cultures. After 3.5 hours of
incubation at 25 °C the final OD600 is checked and the cultures are cooled on ice. After centrifugation
at 6000 rpm (JA10 rotor, Beckman), the cell pellet is processed for purification or frozen at -20 °C.
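
Centrifugation conditions in this section are given sometimes in rpm and sometimes in x g. The two are related through the rotor radius by the standard formula RCF = 1.118e-5 x r(cm) x rpm^2. The sketch below applies it; the 10 cm radius is an arbitrary example and not a specification of any rotor named in the text, so the real radius should be taken from the rotor manual.

```python
import math

# Sketch: convert between rotor speed (rpm) and relative centrifugal force (x g).
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force at the given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed needed to reach the given RCF at the given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Example with an assumed 10 cm radius (illustrative only).
print(f"{rcf_from_rpm(6000, 10.0):.0f} x g at 6000 rpm")
print(f"{rpm_from_rcf(35000, 10.0):.0f} rpm for 35000 x g")
```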

(a) Procedure for the purification of soluble His-tagged proteins from E. coli
(1) Transfer the pellets from -20°C to an ice bath and reconstitute with 10 ml of 50 mM NaHPO4 buffer,
300 mM NaCl, pH 8.0. Transfer to 40-50 ml centrifugation tubes and break the cells as per the following
outline.

(2) Break the pellets in the French Press, performing three passages with in-line washing.

(3) Centrifuge at about 30000-40000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.).

(4) Equilibrate the Poly-Prep columns with 1 ml Fast Flow Chelating Sepharose resin with 50 mM
phosphate buffer, 300 mM NaCl, pH 8.0.

(5) Store the centrifugation pellet at -20°C, and load the supernatant on the columns.
(6) Collect the flow through.

(7) Wash the columns with 10 ml (2 ml + 2 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH
8.0.

(8) Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

(9) Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM imidazole
buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect the 3 corresponding fractions of ~1.5 ml
each. Add to each tube 15 µl of 200 mM DTT (final concentration 2 mM).

(10) Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 µg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample be
too diluted, load 21 µl + 7 µl loading buffer.)

(11) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(12) For immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until immunisation.

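
Two pieces of arithmetic in the steps above can be checked with a short sketch: the DTT concentration reached by adding 15 µl of a 200 mM stock to a ~1.5 ml fraction, and the volumes needed for one 100 µg aliquot in 0.5 ml at 40% glycerol. The function names are invented for illustration, the protein concentration is a hypothetical Bradford result, and the glycerol stock is assumed to be undiluted (100%).

```python
# Sketch: dilution arithmetic for the DTT addition and the immunisation aliquots.

def final_conc_mm(stock_mm, added_ul, sample_ul):
    """Final concentration after adding a small volume of stock to a sample."""
    return stock_mm * added_ul / (sample_ul + added_ul)


def aliquot_recipe_ul(protein_mg_per_ml, protein_ug=100.0, total_ul=500.0,
                      glycerol_fraction=0.40):
    """Return (protein, glycerol, dilution buffer) volumes in µl for one aliquot.

    Assumes a 100% glycerol stock; mg/ml is numerically equal to µg/µl.
    """
    protein_ul = protein_ug / protein_mg_per_ml
    glycerol_ul = total_ul * glycerol_fraction
    buffer_ul = total_ul - protein_ul - glycerol_ul
    if buffer_ul < 0:
        raise ValueError("fraction too dilute to fit 100 µg into the aliquot volume")
    return protein_ul, glycerol_ul, buffer_ul


print(f"DTT: {final_conc_mm(200, 15, 1500):.2f} mM")
print("aliquot (µl):", aliquot_recipe_ul(1.0))  # 1.0 mg/ml is a hypothetical concentration
```

The DTT figure comes out just under 2 mM, which matches the rounded value quoted in step 9.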
(b) Purification of His-tagged proteins from Inclusion bodies 

Purifications are carried out essentially according to the following protocol:
(1) Bacteria are collected from 500 ml cultures by centrifugation. If required, store bacterial pellets at
-20°C. For extraction, resuspend each bacterial pellet in 10 ml of 50 mM TRIS-HCl buffer, pH 8.5, on
an ice bath.

(2) Disrupt the resuspended bacteria with a French Press, performing two passages. 

(3) Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

(4) Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP {Tris(2-carboxyethyl)phosphine
hydrochloride, Pierce}, 6M guanidinium chloride, pH 8.5. Stir for ~10 min. with a magnetic
bar.

(5) Centrifuge as described above, and collect the supernatant. 

(6) Prepare an adequate number of Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow
Chelating Sepharose (Pharmacia) saturated with nickel according to the manufacturer's recommendations.
Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM TRIS-HCl, 1 mM TCEP, 6M
guanidinium chloride, pH 8.5.

(7) Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

(8) Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8.5. Collect and set aside the first 5 ml for possible further controls.

(9) Elute the proteins bound to the columns with 4.5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots, and
collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2 mM).

(10) Measure eluted protein concentration with the Bradford method, and analyse aliquots of ca. 10
µg of protein by SDS-PAGE.

(11) Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2
mM DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8.5.
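
The urea, imidazole and TCEP concentrations of the storage buffer in step 11 are each one third of their values in the elution buffer of step 9, which would be consistent with one volume of eluate ending up in three volumes of storage mix. That three-fold factor is an inference from the numbers and is not stated in the text; the sketch below only reproduces the arithmetic.

```python
# Sketch: check that the step 11 storage concentrations match a three-fold
# dilution of the step 9 elution buffer (an assumed dilution factor).
ELUTION_BUFFER = {"imidazole_mM": 250.0, "urea_M": 6.0, "TCEP_mM": 1.0}


def dilute(concentrations, fold):
    """Apply C1*V1 = C2*V2 to every component for a given fold dilution."""
    return {name: value / fold for name, value in concentrations.items()}


storage = dilute(ELUTION_BUFFER, 3)
for name, value in storage.items():
    print(f"{name}: {value:.1f}")
```

Components that keep the same concentration in both buffers (TRIS-HCl, DTT) or appear only in the storage buffer (glycerol, arginine) would have to be supplied by the diluent and are not modelled here.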

(c) Procedure for the purification of GST-fusion proteins from E. coli

(1) Transfer the bacterial pellets from -20°C to an ice bath and suspend in 7.5 ml PBS, pH 7.4, to
which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet every 25 ml
of buffer) has been added.

(2) Transfer to 40-50 ml centrifugation tubes and sonicate according to the following procedure:

a. Position the probe at about 0.5 cm from the bottom of the tube.

b. Block the tube with the clamp.

c. Dip the tube in an ice bath.

d. Set the sonicator as follows: Timer Hold, Duty Cycle 55, Out. Control 6.

e. Perform 5 cycles of 10 impulses at a time lapse of 1 minute (i.e. each of the five cycles = 10
impulses + ~45" hold).

(3) Centrifuge at about 30000-40000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000 rpm,
for 15 min.

(4) Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows.

(5) Equilibrate the Poly-Prep (Bio-Rad) columns with 0.5 ml (= 1 ml suspension) of
Glutathione-Sepharose 4B resin, wash with 2 ml (1 + 1) H2O, and then with 10 ml (2 + 4 + 4) PBS, pH 7.4.

(6) Load the supernatants on the columns and discard the flow through. 

(7) Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7.4. 

(8) Elute the proteins bound to the columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of ~1.5
ml each.

(9) Measure the protein concentration of the first two fractions with the Bradford method and analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted, load 21
µl + 7 µl loading buffer.)

(10) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(11) For each protein destined for immunisation, prepare 4-5 aliquots of 100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.
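
The SDS-PAGE loading rule in step 9 (a 10 µg aliquot, or 21 µl of sample when the fraction is too dilute) can be written as a small helper. This is only a sketch of that rule: the function name and the example concentrations are hypothetical, and the Bradford result is assumed to be expressed in mg/ml.

```python
# Sketch: sample volume to load for SDS-PAGE from a Bradford concentration.
# Targets 10 µg of protein, capped at the 21 µl fallback volume given in the text.

def sds_page_sample(conc_mg_per_ml, target_ug=10.0, max_sample_ul=21.0):
    """Return (sample volume in µl, protein actually loaded in µg)."""
    sample_ul = target_ug / conc_mg_per_ml  # mg/ml is numerically equal to µg/µl
    if sample_ul > max_sample_ul:
        sample_ul = max_sample_ul  # too dilute: load the maximum instead
    return sample_ul, sample_ul * conc_mg_per_ml


for conc in (1.0, 0.2):  # hypothetical Bradford results in mg/ml
    volume, loaded = sds_page_sample(conc)
    print(f"{conc} mg/ml: load {volume:.1f} µl ({loaded:.1f} µg)")
```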

4. Murine Model of Protection from GAS Infection 
(a) Immunization protocol

Groups of 10 CD1 female mice aged between 6 and 7 weeks are immunized with two or more GAS
antigens of the invention (20 µg of each recombinant GAS antigen), suspended in 100 µl of suitable
solution. Each group receives 3 doses at days 0, 21 and 45. Immunization is performed through
intraperitoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA) for the
first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each immunization
scheme negative and positive control groups are used.

For the negative control group, mice are immunized with E. coli proteins eluted from the purification
columns following processing of total bacterial extract from an E. coli strain containing either the
pET21b or the pGEX-NNH vector (thus expressing GST only) without any cloned GAS ORF (groups
can be indicated as HisStop or GSTStop respectively).

For the positive control groups, mice are immunized with purified GAS M cloned from either GAS
SF370 or GAS DSM 2071 strains (groups indicated as 192SF and 192DSM respectively).

Pooled sera from each group are collected before the first immunization and two weeks after the last
one. Mice are infected with GAS about a week later.

Immunized mice are infected using a GAS strain different from that used for the cloning of the 
selected proteins. For example, the GAS strain can be DSM 2071 M23 type, obtainable from the 
German Collection of Microorganisms and Cell Cultures (DSMZ). 

For infection experiments, DSM 2071 is grown at 37 °C in THY broth until OD600 0.4. Bacteria are
pelleted by centrifugation, washed once with PBS, suspended and diluted with PBS to obtain the
appropriate concentration of bacteria/ml and administered to mice by intraperitoneal injection.
Between 50 and 100 bacteria are given to each mouse, as determined by plating aliquots of the
bacterial suspension on 5 THY plates. Animals are observed daily and checked for survival.
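
The challenge dose check described above, plating aliquots on 5 THY plates and confirming that each mouse receives between 50 and 100 bacteria, amounts to averaging colony counts and scaling by volume. The sketch below shows one way to do this; the colony counts, the plated volume and the injected volume are all hypothetical, since the text gives none of them.

```python
from statistics import mean

# Sketch: estimate the number of bacteria injected per mouse from plate counts.

def dose_per_mouse(colony_counts, plated_ul, injected_ul):
    """Mean CFU per µl from replicate plates, scaled to the injected volume."""
    cfu_per_ul = mean(colony_counts) / plated_ul
    return cfu_per_ul * injected_ul


def within_target(dose, low=50, high=100):
    """True if the dose falls in the 50-100 bacteria range given in the text."""
    return low <= dose <= high


counts = [70, 82, 75, 68, 80]  # hypothetical colonies on the 5 THY plates
dose = dose_per_mouse(counts, plated_ul=100, injected_ul=100)
print(f"estimated dose: {dose:.0f} bacteria, in range: {within_target(dose)}")
```
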
5. Analysis of Immune Sera 

(a) Preparation of GAS total protein extracts

Total protein extracts are prepared by incubating a bacterial culture grown to OD600 0.4-0.5 in 50 mM
Tris pH 6.8/mutanolysin (20 units/ml) for 2 hr at 37 °C, followed by incubation for ten minutes on
ice in 0.24 N NaOH and 0.96% β-mercaptoethanol. The extracted proteins are precipitated by
addition of trichloroacetic acid, washed with ice-cold acetone and suspended in protein loading buffer.

(b) Western blot analysis

Aliquots of total protein extract are mixed with SDS loading buffer (1x: 60 mM TRIS-HCl pH 6.8, 5%
w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100 mM DTT), boiled for 5 minutes at 95 °C,
and loaded on a 12.5% SDS-PAGE precast gel (Biorad). The gel is run using an SDS-PAGE running
buffer containing 250 mM TRIS, 2.5 mM Glycine and 0.1% SDS. The gel is electroblotted onto a
nitrocellulose membrane at 200 mA for 60 minutes. The membrane is blocked for 60 minutes with
PBS/0.05% Tween-20 (Sigma), 10% skimmed milk powder and incubated O/N at 4 °C with
PBS/0.05% Tween-20, 1% skimmed milk powder, with the appropriate dilution of the sera. After
washing twice with PBS/0.05% Tween, the membrane is incubated for 2 hours with peroxidase-conjugated
secondary anti-mouse antibody (Amersham) diluted 1:4000. The nitrocellulose is washed
three times for 10 minutes with PBS/0.05% Tween and once with PBS and thereafter developed with the
Opti-4CN Substrate Kit (Biorad).

(c) Preparation of Paraformaldehyde treated GAS cultures

A bacterial culture grown to OD600 0.4-0.5 is washed once with PBS and concentrated four times in
PBS/0.05% Paraformaldehyde. Following 1 hr incubation at 37 °C with shaking, the treated culture
is kept overnight at 4 °C and complete inactivation of bacteria is then checked by plating aliquots on
THY blood agar plates.

(d) FACS analysis of Paraformaldehyde treated GAS cultures with mouse immune sera

About 10^5 Paraformaldehyde inactivated bacteria are washed with 200 µl of PBS in a 96-well U-bottom
plate and centrifuged for 10 min. at 3000 g, at 4 °C. The supernatant is discarded and the
bacteria are suspended in 20 µl of PBS-0.1% BSA. Eighty µl of either pre-immune or immune mouse
sera diluted in PBS-0.1% BSA are added to the bacterial suspension to a final dilution of either 1:100,
1:250 or 1:500, and incubated on ice for 30 min. Bacteria are washed once by adding 100 µl of
PBS-0.1% BSA, centrifuged for 10 min. at 3000 g, 4 °C, suspended in 200 µl of PBS-0.1% BSA, centrifuged
again and suspended in 10 µl of Goat Anti-Mouse IgG, F(ab')2 fragment specific, R-Phycoerythrin-conjugated
(Jackson Immunoresearch Laboratories Inc., cat. N° 115-116-072) in PBS-0.1% BSA to a
final dilution of 1:100, and incubated on ice for 30 min. in the dark. Bacteria are washed once by
adding 180 µl of PBS-0.1% BSA and centrifuged for 10 min. at 3000 g, 4 °C. The supernatant is
discarded and the bacteria are suspended in 200 µl of PBS. The bacterial suspension is passed through the
cytometric chamber of a FACS Calibur (Becton Dickinson, Mountain View, CA USA) and 10,000
events are acquired. Data are analysed using Cell Quest Software (Becton Dickinson, Mountain View,
CA USA) by drawing a morphological dot plot (using forward and side scatter parameters) on
bacterial signals. A histogram plot is then created on FL2 intensity of fluorescence log scale
recalling the morphological region of bacteria.
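
Because 80 µl of diluted serum is added to 20 µl of bacterial suspension, the serum has to be pre-diluted slightly less than the stated final dilution. The sketch below works out that pre-dilution for the three final dilutions in the text; the function name is illustrative only.

```python
# Sketch: serum pre-dilution needed so that 80 µl serum + 20 µl bacteria
# gives the stated final dilution.

def serum_predilution(final_dilution, serum_ul=80.0, bacteria_ul=20.0):
    """Return N such that a 1:N pre-diluted serum yields 1:final_dilution in the well."""
    total_ul = serum_ul + bacteria_ul
    return final_dilution * serum_ul / total_ul


for final in (100, 250, 500):
    print(f"final 1:{final} -> pre-dilute serum 1:{serum_predilution(final):.0f}")
```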

It will be understood that the invention has been described by way of example only and 
modifications may be made whilst remaining within the scope and spirit of the invention.